There’s a time to be critical


An accusation that referees are too demanding and editors too supine demands a response.
Authors, editors and referees all have lessons to learn. 


Last week one of our editors received the following from a referee
of a paper currently under assessment:

“I guess the issue with this kind of paper is that there are an
almost limitless number of changes/additions that could be made,
especially considering the complexity of the data presented here. I
suspect that this paper might run into a few reviewer ‘issues’ as it
covers so much ground. In my review I have tried to be cognisant of
your 27 April Nature article (‘End the wasteful tyranny of reviewer
experiments’) and as such give this a ‘yes’ vote pending revisions.”

In the same week, we received a note from another reviewer to the
effect that the “tyranny of reviewer experiments” had significantly
increased the impact of the claims made in a manuscript he assessed,
and he hoped that the authors would agree that the further work was
worth the effort.

Clearly, some targets of the Nature article have taken note of it. In
brief, that column, by Hidde Ploegh at the Massachusetts Institute
of Technology in Cambridge, argued that referees too often ask for
more experiments, and that editors too passively tend to pursue such
requests (see Nature 472, 391; 2011).

But for the paper mentioned above, the question of whether further
work is required is still open until the editor decides otherwise. Our
editors must ask themselves: would further work lift the paper over
a threshold of robustness or significance that justifies publication in
Nature, or is it already sufficient? And have other referees differing
views about this?

In resolving these questions, the editor will discuss the paper with
colleagues and also with the referees.

The accusation that editors are too passive was not specifically
directed at Nature, but we take it seriously. We could too easily
discount it on several grounds. Surveys of our published authors, as
well as general surveys of scientists conducted independently,
overwhelmingly support the view that papers have gained in their passage
through peer review. Critics do not realize how much discussion and
critical assessment underpins our editorial decisions. And without
question, the ever-increasing pressure to publish is far too often
leading authors to submit papers that would gain substantially in
scientific significance with some further work.

It is important also to acknowledge that our referees generally
put in very substantial amounts of labour on behalf of their fellow
scientists, and make constructive suggestions that ensure that some
of the extraordinary claims that Nature publishes are backed by the
necessary evidence.

Nevertheless, a more reflective response is also required.

At Nature and at the Nature research journals, our teams of staff
editors are expected to make their own conclusive judgements about
a paper’s position below or above their journal’s threshold, and will
often overrule referees’ expectations in this respect in either
direction. For example, we may decide that even if a paper lacks a new
insight into mechanism, it represents a sufficient resource in the
novelty of its data or technique to make a significant impact on the 
discipline. Conversely, we may decide that an additional piece of work 
would greatly increase a paper's range or depth of impact, and make 
that a condition of publication — we hope to the ultimate benefit of the 
community and the authors themselves (see Nature 463, 850; 2010). 
But our editors do not necessarily have the expertise to judge
whether, for example, an application of a novel technique or reagent
has been adequately validated. Authors are free to challenge a request
for more work in these circumstances, and an editor may seek technical
advice from another expert to resolve the matter.

Spurred by this discussion, we looked back at recent decisions. We
soon found several
cases in which, with technical guidance where 
necessary, we overruled a referee's request for additional work — for 
example, when the editor felt that, contrary to a referee’s assertion, 
the gain in robustness would not be sufficient to justify the effort 
and delay. 

What lessons can be learnt, therefore? By authors: in the interests of 
robustness and genuine impact, resist the pressure to publish
prematurely. By editors everywhere: don’t be supine in the face of
referees’ requests.

And above all, by referees: please don’t ignore any impulse to 
demand more, but be self-critical too.




Getting personal 


Targeted therapies work, but need help to fulfil 
their potential. 


Biology is like economics, participants at a European Commission
meeting on personalized medicine in Brussels heard last week: they
are both complex and neither is properly understood. The view struck
a chord with attending scientists and health-care economists, who
felt that personalized medicine should be happening, and didn't
understand why, mostly, it isn't.

Personalized medicine aims to use the latest genomic knowledge
and technologies to tailor treatments to individuals. Pivotal to the
field are drugs that have been designed to hit a particular molecular
pathway that has gone wrong in a disease. The European Medicines
Agency has already approved around 15 such drugs for cancer therapy
and is set to approve several more in the next year or so.

The personalized approach faces two major problems: complex 
biology and complex economics. The pathway involved is often 
not well understood, and most targeted drugs are so expensive that 
health-care systems and insurance companies don’t want to pay for 
them, even if they reduce waste and should therefore save on overall 
treatment costs in the long term. The drug gefitinib, for example, 
costs around €20,000 (US$28,000) per patient and targets the EGFR 
pathway, which is disrupted in fewer than 15% of patients with lung 
cancer. What's more, targeted drugs need to be accompanied by diag- 
nostic tests to identify suitable patients, yet many health-care systems 
have no mechanism to pay for the tests. The result is an absurd situ- 
ation in which expensive drugs can be prescribed without testing, 
and therefore to some patients who will gain no benefit. 

As arguments about the value of personalized medicine rage 
around the world, France has found its own solution — at least for 
cancer, where molecular medicine is most advanced. In 2005, the 
country said it would pay for the treatment of every citizen shown to 
be likely to benefit from targeted drugs. Its National Cancer Institute 
set up 28 platforms for molecular genetics at university hospitals and 
cancer centres with expertise in both molecular and pathological 
analysis. Biopsies of cancerous tissue from patients all over France 
are sent to these platforms for a battery of 20 or so genetic tests. 
If the tissue displays a genetic signature in any molecular pathway 
targeted by one of the drugs, the patient gets treated with it. The 
platforms develop the tests themselves, and are already working on a
test to accompany a drug that researchers hope will be approved this 
year for melanoma. Targeted drugs now account for 57% of France's 
cancer-treatment budget. The Czech Republic has a similar system. 

The model seems to work. The French platforms have so far tested 
samples from around 15,000 people with lung cancer for alterations 
in the EGFR pathway. Just over 1,700 patients tested positive and 
were given gefitinib until they stopped responding (an average of
38 weeks). That has cost France €35 million. Had all 15,000 patients
been given an eight-week course of gefitinib just to see whether they 
would respond, it would have cost the nation another €69 million 
— with no extra benefit. 
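
As a rough consistency check on the French figures quoted above, the short sketch below derives the share of tested patients who were positive and the implied cost per treated patient. It uses only the numbers given in the text; the variable names are illustrative, and the calculation assumes that the €35 million covers treatment of the roughly 1,700 positive patients only.

```python
# Figures quoted in the text (the 1,700 is "just over 1,700", so results are approximate)
patients_tested = 15_000
patients_positive = 1_700
total_cost_eur = 35_000_000

# Share of tested patients with an EGFR-pathway alteration
positive_share = patients_positive / patients_tested

# Implied average cost per treated patient (assumption: the total covers treatment only)
cost_per_treated_patient = total_cost_eur / patients_positive

print(f"Share testing positive: {positive_share:.1%}")
print(f"Implied cost per treated patient: EUR {cost_per_treated_patient:,.0f}")
```

Both results sit close to the figures cited earlier in the article: a positive share below 15%, and a per-patient cost of around €20,000.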

Some assessments, however, have concluded that personalized 
drugs do not offer enough benefit to justify the cost. It will not be easy 
to persuade the spectrum of state health systems and health-insur- 
ance companies that personalized medicine makes economic sense. 
Understandably, they will want a lot more evidence that it works. 

Much reluctance also seems to come from a medical profession
unused to needing genetic tests to select patients and from inflexible
bureaucratic systems. The European Commission’s health directorate
could help by encouraging European countries to harmonize their
health-technology assessments, or even by issuing its own (non-binding)
conclusions on which targeted drugs it considers cost-effective. And
the commission’s research directorate could provide greater support
for efforts to translate the results of preclinical research on
molecular pathways into the clinic, which it plans to do in its 2012
call for proposals.

Amid the excitement and attention paid to cancer, it is crucial 
to remember that other conditions — such as psychiatric disor- 
ders — carry just as great a societal burden, yet remain too poorly 
understood to benefit. The research directorate has enabled a great 
deal of fundamental research on animal models designed to under- 
stand such complex conditions, and it must continue to do so, in 
parallel with its translational efforts. We are at the beginning of 
personalized medicine in the clinic. But we are also just starting to 
understand the mechanisms behind most of the diseases that are 
likely to gain the most.


The human epoch 


Official recognition for the Anthropocene 
would focus minds on the challenges to come. 


Geologists are used to dealing with heavy subjects, so who better
to decide on one of the more profound debates of the time: does
human impact on the planet deserve to be officially recognized? Are
we living in a new geological epoch, the Anthropocene?

This is no idle conundrum. Although the term has long been
used informally to refer to the current, human-dominated phase of
Earth’s history, a working group of the International Commission on
Stratigraphy, the body that defines the divisions of geological time, is
studying the case for making it official (see Nature 473, 133; 2011).

The Anthropocene would be a peculiar addition to the geological
timescale. So far, it is more a prediction than a fact of Earth’s history,
because many of its defining features are only starting to register in
the rock record. And the driving force behind the geological
transition it labels is not a continental rearrangement, massive volcanism
or an extraterrestrial impact (forces that have reshaped the planet
in the past). Yet the Anthropocene does deserve proper recognition.
It reflects a grim reality on the ground, and it provides a powerful
framework for considering global change and how to manage it.

Human activity is set to leave an indelible mark on the geological
record. Deforestation, mining and road building have unleashed tides
of sediment down rivers and onto the ocean floor. Fossil-fuel use and
land clearance have already emitted perhaps a quarter as much carbon 
into the atmosphere as was released during one of the greatest plan- 
etary crises of the past, the Palaeocene-Eocene Thermal Maximum 
55 million years ago. Now, as then, corals and other organisms are 
recording a global carbon-isotope shift. The increasing acidification 
of the oceans as they absorb carbon dioxide will dissolve carbonate 
from deep sediments, and what is likely to be the sixth great mass 
extinction in Earth’s history will gather speed, adding vivid new 
markers to the record. 

But is it too soon to declare an end to the Holocene, the stable, 
largely benign epoch that has lasted just 11,700 years — a heartbeat in 
geological time? What impact will an official change in the geological 
timescale have on the funding and status of Holocene studies? And is 
it wise for stratigraphers to endorse a term that comes gift-wrapped 
as a weapon for those on both sides of the political battle over the 
fate of the planet? 

The scale of the changes already under way and the real value of a 
unified approach to studying human influences on the planet should 
surely quash these concerns. The Anthropocene is defined not just 
by climate change or extinctions, but by a linked set of effects on 
Earth and its biosphere, from perturbations in the nitrogen cycle to 
the dispersal of species around the globe. Official recognition of the 
concept would invite cross-disciplinary science. And it would encour- 
age a mindset that will be important not only to fully understand the 
transformation now occurring but to take action to control it. 

Humans may yet ensure that these early years of the Anthropocene 
are a geological glitch and not just a prelude to a far more severe dis- 
ruption. But the first step is to recognize, as the term Anthropocene 
invites us to do, that we are in the driver's seat.
WORLD VIEW


My country, Brazil, is home to 80% of the remaining Amazon
rainforest, and has rightly worked to find ways to sustainably
manage and commercialize these stocks of forest carbon.
However, like most countries with long coastlines, Brazil has so far 
missed the opportunity to value and protect another important carbon 
store: its mangroves, seagrasses and tidal marshes. 

The 9,000-kilometre vibrant and productive Brazilian coastline is 
covered with vegetated ecosystems that together contain hundreds of 
millions of tonnes of such carbon, at least. Brazil is home to the third- 
largest mangrove area in the world and has more than 20,000 hectares 
of seagrasses near tropical reefs and in coastal lagoons. 

Why aren’t these systems recognized as vital pieces of the
climate-change puzzle? They cover just 0.5% of marine
areas, but are among the largest carbon sinks in 
the ocean. Typically, they store up to 15 times 
more carbon per hectare than terrestrial soils, 
absorbed over hundreds or even thousands of 
years. And these coastal systems sequester carbon 
10-50 times faster than terrestrial forests. 

As the United Nations REDD+ scheme to pro- 
tect the ‘green’ carbon stocks in tropical forests 
develops, it is time to broaden the reach of such 
mechanisms so that they also value and protect 
coastal stocks of ‘blue’ carbon. 

In February, I took part in a scientific work- 
shop in Paris to evaluate such mechanisms and 
offer policy-makers the information and advice 
they need to make them happen. The event was 
a meeting of the International Working Group 
on Coastal Blue Carbon, formed jointly by Con- 
servation International, The International Union 
for Conservation of Nature and the Intergovernmental Oceanographic 
Commission of the UN Educational, Scientific and Cultural Organi- 
zation. It concluded that coastal carbon deposits should be taken into 
account in national emission inventories and in the processes and 
mechanisms of the UN climate framework. The group will continue to 
meet over the next two years and will urge policy-makers to recognize 
the importance of blue carbon. 

They must do this soon, because coastal areas are among the 
most threatened ecosystems on Earth. Between 30% and 50% of 
mangroves have disappeared in the past 50 years; about 30% of the 
world’s seagrasses are gone; and half of the global coverage of salt 
marshes has been destroyed. That loss is continuing and in many 
places accelerating — 2% of those important coastal systems are lost 
each year. That is four times the estimated rate 
Add coastal vegetation to
the climate critical list 


Forests are protected, but carbon sinks in mangroves, seagrasses and marshes 
are ignored. Margareth da Silva Copertino wants Brazil to change that. 


Durham, North Carolina, estimate that emissions from such clearing 
result in up to 900 million tonnes of carbon dioxide emissions per year, 
roughly equal to the annual CO2 emissions from energy consumption
and industry for the whole of Germany. That is about 10-20% of
the emissions from deforestation globally, or 2% of all anthropogenic 
greenhouse-gas emissions. 

Brazil would be a good place to test new mechanisms to value and 
conserve blue carbon. The country has about 200 protected areas 
along the coastline, spanning different latitudes and ecosystems. 
But these cover just 20% of the country’s total coastal territory, and 
represent only one-quarter of the area highlighted by the Brazilian 
government as a conservation priority. The country’s mangroves, 
salt marshes and seagrasses are under mounting 
pressure from a combination of intense human 
activities, increasing coastal development (about 
one-fifth of the Brazilian population lives by the 
coast), agricultural run-off, pollution and inten- 
sive aquaculture. These rising threats require the 
urgent application of mechanisms to increase the 
monetary value of coastal habitats, promote their 
conservation and avoid further degradation. 

There is more to protecting coastal ecosystems 
than global recognition of their importance, 
however. We need focused and coordinated 
research, as well as scientific data collection, 
to build financial mechanisms that value these 
ecosystems as tools to reduce greenhouse-gas 
emissions. 

But local policies and regulation are vital 
too. Developing nations must support small 
landowners and the livelihoods of local com- 
munities, expand the law-abiding ‘responsible’ fraction of economic 
sectors, improve law enforcement, effectively manage protected areas 
and recover the many degraded ones. In Brazil, as in other countries, 
we need to stop the destruction of existing mangroves, tidal marshes 
and seagrasses. 

Effectively accounting for the carbon in coastal systems has the 
potential to transform the management and conservation of coastal 
areas on both the global and local scales. For the sake of Brazil and the 
world, my country should push for the role of oceans and their coastal 
ecosystems to be included in UN climate talks this year, alongside 
rainforests. Brazil has already played an important part in many of 
these discussions, but it could be a true leader if it were to recognize 
the significant potential hidden in its long, blue coast.


Margareth da Silva Copertino is a lecturer in biological 
oceanography at the Institute of Oceanography, Federal University of 
Rio Grande (FURG), Rio Grande, Brazil. 

e-mail: doccoper@furg.br
Selections from the
scientific literature 


RESEARCH HIGHLIGHTS 




Starved cells turn 
on themselves 


Bacteria are so essential to 
mammalian digestion that 
without them, some gut 

cells break down their own 
components to obtain energy. 

Scott Bultman at the 
University of North Carolina at 
Chapel Hill and his colleagues 
studied the impact of gut 
bacteria on the metabolism of 
mice. They found that in mice 
lacking any bacteria, colon 
cells are energy deprived and 
undergo autophagy, or ‘self-eating’.
Putting bacteria that produce
butyrate (colon cells’ main
energy source) into the
guts of these mice returned the 
cells’ metabolism to normal. 

A decrease in butyrate- 
producing bacteria in the gut, 
perhaps caused by dietary 
changes, could compromise 
colonic function, the authors 
say. This might contribute to 
higher rates of inflammatory 
bowel disease and colon cancer. 
Cell Metab. 13, 517-526 (2011) 


Strike a pose
and hide


Cuttlefish evade predators by matching not only their colours and
patterns to the background, but also their postures.

Roger Hanlon and his team at the Marine Biological Laboratory in
Woods Hole, Massachusetts, presented the common European cuttlefish
(Sepia officinalis; pictured left) with separate backgrounds
containing stripes at different angles: horizontal, vertical, and
diagonal. The animals raised their limbs to match the angles of the
stripes, but didn't respond to a blank background.

When placed next to artificial algae in the lab, the animal struck a
pose to mimic its neighbour (right). Similar behaviour was observed in
natural habitats.

The authors suggest that visual cues are important for such creatures
to adopt cryptic body postures and achieve maximum stealth.
Proc. R. Soc. B doi:10.1098/rspb.2011.0196 (2011)


Warblers of the underwater world


Many birds, mammals and amphibians vary the frequency and intensity
of their vocalizations to expand their vocabulary. Aaron Rice, Bruce
Land and Andrew Bass at Cornell University in Ithaca, New York, show
that fish also use forms of ‘acoustic nonlinearity’, such as frequency
jumps and biphonation (the simultaneous expression of two independent
frequencies).

The authors recorded and analysed the vocal calls of three-spined
toadfish (Batrachomoeus trispinosus; pictured), which produce ‘hoots’
and ‘grunts’ by vibrating their swim bladders. Around 35% of the
fish's calls had at least one form of nonlinearity. Severing the
animals’ vocal motor nerve stopped them producing these effects.

The fact that fish make complex vocalizations previously found only
in four-limbed vertebrates suggests that there is a major selection
pressure to produce innovation in acoustic signals.
Proc. R. Soc. B doi:10.1098/rspb.2011.0656 (2011)


Designer proteins
target flu


Proteins that bind to the 1918 pandemic influenza virus have been
designed using computer modelling.

The viral surface protein haemagglutinin is essential to the flu
virus’s infection of human cells, making it an attractive drug target.
David Baker at the University of Washington in Seattle and his
colleagues computed ‘hot spots’ (protein residues that can interact
with haemagglutinin) from the 1918 virus on the basis of properties
such as the predicted strength of their interaction with this protein.
They then used computer algorithms to search a set of 865 protein
structures for ones that could incorporate these hot spots, and came
up with 88 proteins able to acquire at least two of them. When
expressed in yeast, two of these proteins bound to haemagglutinin.
After additional optimization, the researchers solved the molecular
structure of one of these tailored proteins while it was bound to the
flu haemagglutinin, revealing atomic-level accuracy in the designed
interaction.
Science 332, 816-821 (2011)


IMMUNOLOGY 


Blocking brain 
inflammation 


Harmful brain inflammation 
triggered by a subset of 
immune cells can be quelled 
by the action of a hormone on 
an oestrogen receptor. 

Microglia are immune cells 
that trigger inflammation in 
the central nervous system and 
carry an oestrogen receptor 
called ERβ. Christopher
Glass and Kaoru Saijo at the
University of California, San
Diego, and their team screened
a panel of molecules that bind
to ERβ for their ability to block
inflammation in microglia.
They found that a few
synthetic chemicals, as well
as a natural steroid hormone
called ADIOL, activate
ERβ, kicking off a cascade
of reactions that ultimately 
prevents inflammation. 
ADIOL also protects mice 
from an autoimmune 
condition similar to multiple 
sclerosis. 

The authors suggest that 
drugs that stimulate this 
pathway could be used to 
treat neurodegenerative and 
autoimmune diseases. 

Cell 145, 584-595 (2011) 


Awakening a
fault line within 


A large fault section off 
Sumatra that had been 
seismically dormant for more 
than 30 years has recently 
reawakened, thanks to a series 
of large earthquakes in the area 
during the past decade. 

Kelly Wiseman at the 
University of California, 
Berkeley, and her team linked 
data derived from the Global 
Positioning System on surface 
motions from Sumatran 
stations with the known 


geometry and mechanisms 
of recent quakes. They found 
that a 900-kilometre-long 
‘backthrust’ — arising from the 
longer Sunda megathrust fault 
that caused the 2004 Indian 
Ocean earthquake and tsunami 
— produced a moderate quake 
in 2005 and another in 2009. 
If a rare faulting event were
to rupture the newly active 
thrust, it could produce a quake 
on the order of magnitude 8.5, 
and, potentially, a large 
tsunami, the authors suggest. 
Geophys. Res. Lett. doi:10.1029/ 
2011GL047226 (2011) 


Achieving spin 
control 


By combining an optical 
microscope with an atomic 
force microscope (AFM), 
researchers have imaged 
individual electronic spins 
with high resolution. 

Amir Yacoby and his team 
at Harvard University in 
Cambridge, Massachusetts, 
implanted clusters of nitrogen 
ions into a diamond sample, 
creating individual spins. 
They applied a magnetic 
field gradient to the spins by 
passing the magnetized tip 
of the AFM over the sample. 
This allowed them to draw a
map of the sample’s individual
spins in three dimensions. The 
authors showed that it would 
be possible to resolve spins just 
9 nanometres apart. 

The system could be used 
for studies of fundamental 
physics, because the set-up 
allows quantum control and 
manipulation of individual 
spins. 

Nature Phys. doi:10.1038/
nphys1999 (2011) 


Taming psoriasis 
with vitamin D 


Vitamin D may ameliorate the 
symptoms of the inflammatory 
skin disease psoriasis by 
enhancing the production
of a molecule that blocks the
assembly of inflammatory 
complexes in the skin. 
COMMUNITY


The most viewed 
papers in science 


NEUROSCIENCE
A search for depression genes


A genome-wide analysis of more than 
15,000 people has revealed an association 
between a gene and major depression. 

Martin Kohli and Elisabeth Binder at the 
Max Planck Institute of Psychiatry in Munich, Germany, and 
their colleagues first compared the genomes of 353 patients 
with depression with those of 366 controls. They teased out a 
gene, SLC6A15, that was strongly associated with depression,
and went on to replicate this finding in six other independent 
groups of patients. The gene encodes a transporter protein 
that moves certain amino acids across the cell membrane of 
neurons and may be involved in regulating the transmission 
of glutamate, a neurotransmitter. 

The gene variant linked with depression was associated with 
reduced SLC6A15 expression in the human hippocampus, as 
well as decreased volume of this brain region.


Neuron 70, 252-265 (2011) 


Jürgen Schauber and
Robert Besch at the Ludwig 
Maximilian University in 
Munich, Germany, and their 
co-workers found that, under 
certain conditions, DNA in the 
cytosol of cultured skin cells 
activates immune complexes 
called inflammasomes that 
contain the protein AIM2. 
Elevated levels of this DNA 
and AIM2 expression were 
also found in skin cells from 
people with psoriasis. When 
normal skin cells were treated 
with the antimicrobial peptide 
cathelicidin LL-37, whose 
production in the skin is 
controlled by vitamin D, the 
peptide bound to cytosolic 
DNA, inhibiting the formation 
of AIM2-containing 
inflammasomes. 

Stimulating cathelicidin 
production may be a promising 
approach for treating psoriasis, 
the authors suggest. 

Sci. Transl. Med. 3, 82ra38 (2011)


Diamond lighter 
than a feather 


Aerogels are extremely porous 
and lightweight materials 
with a large surface area and 
many potential applications. 


Peter Pauzauskie, now at the
University of Washington in
Seattle, and his colleagues
have created a diamond version
of the material (pictured), by
squeezing an aerogel of
amorphous carbon until it took
on a crystalline structure.

The authors used high- 
pressure neon gas to fill and 
support the delicate carbon 
structure. They then zapped it 
with a laser that compressed 
and heated the gel, probably 
to more than 1,600 kelvin, 
until it became diamond. 

Diamond aerogels could 
be useful as antireflective 
coatings, thermal conductors 
and other materials, the 
authors say. 

Proc. Natl Acad. Sci. USA 
doi:10.1073/pnas.1010600108 
(2011) 
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SEVEN DAYS 


Nuclear plans in flux 


Uncertainty over the future 
of nuclear power grew in 
several countries last week. 
Japan’s prime minister Naoto 
Kan said his country would 
review its nuclear policy 

(for more, see page 263). In 
Germany, a leaked draft report 
from an ethics commission 
on safe energy, set up by 
Chancellor Angela Merkel, 
recommended shutting all 
nuclear plants by 2021. In the 
United States, inspections of 
nuclear plants by the Nuclear 
Regulatory Commission 
threw up a number of flaws. 
And European nuclear-safety 
regulators met in Brussels 

to announce details of stress 
testing for nuclear plants, but 
could not agree on criteria. 


Fisheries reform 
Drafts of the European 
Commission's proposed 
overhaul of Europe’ fishing 
industry were leaked to 

the media last week. The 
commission wants to cut catch 
quotas so that by 2015, stocks 
are fished at the maximum 
yield that is sustainable. It also 
hopes to let fishermen buy 
and sell catch quotas, and to 
ban the practice of throwing 
back some caught fish. Some 
scientists worry about the 
expense of the research base 
and enforcement required 

to sustain and police sucha 
system. The final proposal will 
be presented on 13 July. See 
go.nature.com/edjtge for more. 


US retains brains 
Ina bid to keep talented 
overseas researchers, the 

US Department of Homeland 
Security is giving more 
foreign science students 

extra time to search for work 
after graduation. Graduates 
can usually apply to stayin 
the United States for a year, 
receiving related postgraduate 


The news in brief 


Shale-gas fracking faces French ban 


When the Oscar-nominated documentary 
Gasland showed people setting their tap water 
on fire (pictured), it thrust hydraulic fracturing 
into the spotlight. The technique, also known 
as ‘fracking’, in which high-pressure fluids 

are pumped into shale formations to fracture 
the rock and force out natural gas, has been 
accused of releasing methane into well water 
(hence, perhaps, the flammable tap water) and 
of polluting groundwater with toxic chemicals. 


training, before having to 
switch visas or leave the 
country. But on 12 May, the 
department added a range of 
science-related subjects — 
including neuroscience, drug 
design and environmental 
science — to a list of degrees 
eligible for a 29-month 
postgraduate stay. 


Openness inquiry 
The Royal Society in London is 
asking whether scientific data 
should be shared more openly 
with the public. On 13 May, it 
launched an inquiry into how 
scientific information should 
be managed “to improve the 
quality of research and build 
public trust”. A working group 
chaired by Geoffrey Boulton, 
a glaciologist at the University 
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for more. 


of Edinburgh, UK — and 
including the editor-in-chief 
of Nature — will release its 
conclusions in early 2012. It 
is accepting submissions of 
evidence until 5 August. 

See go.nature.com/w2di6n 
for more. 


IPCC overhaul 


The Intergovernmental Panel 
on Climate Change last week 
agreed to change its workings 
and governance. The reforms 
were decided at a general 
assembly in Abu Dhabi. See 
page 261 for more. 


Arctic land rush 
Cooperation, not conflict, 
was the message relayed 
by politicians at the Arctic 
Council’s biennial meeting 
Now France could become the first country 
to ban the practice. On 11 May, the French 
parliament's lower house voted for a ban; the 
upper house will vote next month. Several 
areas in the United States have recently issued 
moratoriums on fracking, and a panel set up on 
5 May by Department of Energy head Steven 
Chu will report in three months on how to 
improve its safety. See go.nature.com/wkizub for more. 


in Nuuk, Greenland, on 
12 May. The rush to claim 
seabed territory and oil and 
gas resources in the melting 
Arctic threatens to spark 
international disputes. But 
US secretary of state Hillary 
Clinton described the 
council as the ‘preeminent 
intergovernmental body’ 

for solving problems. The 
meeting saw the signing of the 
council's first legally binding 
treaty between Arctic nations 
— although it relates only to 
cooperation on search and 
rescue missions in the region. 


Korean science belt 
South Korea’s 5.2-trillion-won 
(US$4.8-billion) ‘science 
belt’ project will be based — 
unsurprisingly — in a region 
already home to the country’s 
leading science university, the 
Korea Advanced Institute 
of Science and Technology, 
and numerous industrial 
laboratories. Korean media 
said a site-selection committee 
had chosen the Daedeok 
research district, in the city 

of Daejeon, to host a planned 
410-billion-won rare-isotope 
accelerator and 25 of the 

50 laboratories of a new 
basic-science institute. The 
infrastructure is planned to be 
built by 2017. 


FUNDING 
Low odds at the NIH 


Fewer than one in five research 
grant applications to the US 
National Institutes of Health 
(NIH) will gain funding in the 
2011 fiscal year, according to 
Francis Collins, the agency's 
director. Predicted success 
rates of 17-18% would be 

“the lowest in history”, Collins 
told a Senate committee on 
11 May. In 2010, the NIH’s 
grant-application success rate 
was 20%. See go.nature.com/g9nh4t 
for more. 


UK facilities cuts 


The UK government has 
revealed cuts to planned 
scientific facilities, part of an 
effort to shrink the national 
deficit. Three projects have 
definitely been axed: a national 
supercomputing service called 
ARCHER; a planned upgrade 


TREND WATCH 


The number of drug candidates 
entered into early clinical trials 
increased from 2000 to 2009, but 
that hasn't led to more drugs in the 
clinic. Success rates for late-stage 
(phase II and phase III) trials 

are falling, says the Centre for 
Medicines Research in London. 
John Arrowsmith, a scientific 
director at Thomson Reuters, 
which owns the centre, says the 
attrition rate is “unsustainably 
high” (Nature Rev. Drug Discovery 
10, 328; 2011). A clear-out of weak 
candidates in 2009 may improve 
future success rates, he adds. 


to the Rothera Research 
Station in Antarctica; and a 
£50-million (US$80-million) 
centre for computational 
science at the Daresbury 
Science and Innovation 
Campus in Cheshire. 


EVENTS 


Iran goes nuclear 


Self-sustaining nuclear 
reactions have begun inside 
Iran’s first commercial 
nuclear power plant. On 

10 May, Atomstroyexport, 

a Russian state-owned firm 
building the Bushehr nuclear 
plant (pictured), said it had 
begun power tests of the 
915-megawatt pressurized- 
water reactor. Construction 
was begun in 1975 by Siemens 
but was suspended after the 
1979 Islamic Revolution. 
Russia resumed construction 
in 1995. 


Reactor meltdown 


The unit 1 reactor at Japan’s 
Fukushima Daiichi nuclear 
plant melted down entirely 
after a massive earthquake and 
tsunami struck on 11 March, 
according to analysis from 


the plant’s owners, the Tokyo 
Electric Power Company. 

Data provided by recalibrated 
equipment inside the reactor 
indicates that the fuel rods had 
lost their surrounding coolant 
four and a half hours after the 
tsunami arrived. Most of the 
fuel had probably already fallen 
to the bottom of the vessel by 
the time it was flooded with 
sea water. The full meltdown 
will complicate future clean-up 
efforts. See go.nature.com/frm7uk 
for more. 


Shuttle launch 
NASA’s space shuttle 
Endeavour launched from 

the Kennedy Space Center in 
Florida for its final flight on 
the morning of 16 May. The 
trip is the penultimate mission 
of the shuttle fleet. See page 
262 for more. 


Chilean dams 

Officials in Chile last week 
approved the construction of 
five hydropower dams across 
two major rivers in Patagonia 
— part of the US$7-billion 
HidroAysén project, which 
aims to generate 2.75 gigawatts 
of power for Chile. But violent 
demonstrations followed 

the approval of the dams, 

and scientists have criticized 
the environmental-impact 
assessment used to justify the 
scheme. Chilean electricity 
utilities Colbun and Endesa 


PHARMA’S FALLING SUCCESS RATE 


Although more drugs were pushed into clinical trials over the 
past few years, success rates at key stages declined. 
SEVEN DAYS | THIS WEEK 


22-26 MAY 

The American 
Astronomical Society 
meets in Boston, 
Massachusetts. 
go.nature.com/q4otmo 


22-27 MAY 

At its general meeting 
in Paris, the World 
Organisation for Animal 
Health will celebrate the 
global eradication of 
rinderpest, a devastating 
cattle disease. 
go.nature.com/z3n2hj 


still need permission to build 
a transmission line that would 
carry power thousands of 
kilometres from the remote 
Aisén region to Santiago. 

See go.nature.com/gnbvasl 
for more. 


Genomics research 


Mount Sinai School of 
Medicine in New York is 
setting up a genomics research 
institute in collaboration with 
Pacific Biosciences, of Menlo 
Park, California, which is 
developing technology for 
real-time, single-molecule 
DNA sequencing. The 
company’s chief scientist, the 
charismatic computational 
biologist Eric Schadt, will 
direct the new institute and 
retain his position with Pacific 
Biosciences, the Mount Sinai 
School announced on 16 May. 


Hepatitis milestone 
On 13 May, the US Food 

and Drug Administration 
approved the first drug to 
directly target the hepatitis C 
virus. The drug, boceprevir 
(Victrelis), is made by Merck, 
based in Whitehouse Station, 
New Jersey. Another drug, 
telaprevir (Incivek) — made 
by Vertex Pharmaceuticals in 
Cambridge, Massachusetts — 
is expected to be approved by 
23 May. 
Final shuttle flights leave NASA facing a void p.262 
Universities take energy savings to an extreme p.263 
A fight to the finish at a top cancer centre p.264 
California’s lonely struggle to cut carbon p.268 


CLIMATE CHANGE 


IPCC chairman Rajendra Pachauri faced calls to quit after errors were found in a key report. 


Major reform for 
climate body 


Intergovernmental panel aims to become more responsive. 


BY QUIRIN SCHIERMEIER 


After months of soul-searching, the 
Intergovernmental Panel on Climate 
Change (IPCC) has agreed on reforms 
intended to restore confidence in its integrity 
and its assessments of climate science. 
Created as a United Nations body in 1988 
to analyse the latest knowledge about Earth’s 
changing climate, it has worked with thousands 
of scientists and shared the Nobel Peace Prize 
in 2007. But its reputation crumbled when 
its leadership failed to respond effectively to 
mistakes — including a notorious error about 
the rate of Himalayan glacier melting — that 
had slipped into its most recent assessment 
report (see Nature 463, 276-277; 2010). 
That discovery coincided with the furore 
over leaked e-mails from the University of East 


Anglia’s Climatic Research Unit in Norwich, 
UK (see Nature 462, 397; 2009). Some e-mails 
seemed to show that leading climate scientists, 
who had contributed key findings to previous 
IPCC reports, had tried to stifle critics. This put 
the panel — especially its chairman, Rajendra 
Pachauri — under intense pressure. The 
InterAcademy Council, a consortium of 
national science academies, was commissioned 
to review the structure and procedures of the 
IPCC and to suggest improvements to its opera- 
tions (see Nature 467, 14; 2010). 

The council identified the lack of an execu- 
tive body as a key factor in the IPCC’s failure 
to respond to the crisis. 
It also urged the panel 
to improve the transparency 
of its assessments 
and to make its 
communication and outreach activities more 
professional. The IPCC adopted several minor 
changes at a meeting last October (see Nature 
467, 891-892; 2010). 

More substantial reforms were signed off 
last week in Abu Dhabi at a meeting of dele- 
gates from IPCC member states. An executive 
committee will be created to oversee the body's 
daily operations and to act on issues that cannot 
wait for full plenary meetings. The 13-strong 
committee will be led by the chairman, and 
includes the vice-chairs and co-chairs of its 
working groups and technical support units. 

A new conflict-of-interest policy will require 
all IPCC officials and authors to disclose financial 
and other interests relevant to their work. 
(Pachauri had been harshly criticized in 2009 
for alleged conflicts of interest.) The meeting 
also adopted a detailed protocol for address- 
ing errors in existing and future IPCC reports, 
along with guidelines to ensure that descrip- 
tions of scientific uncertainties remain consist- 
ent across reports. “This is a heartening and 
encouraging outcome of the review we started 
one year ago,” Pachauri told Nature. “It will 
strengthen the IPCC and help restore public 
trust in the climate sciences.” 

The first major test of these changes will be 
towards the end of this year, with the release of 
a report assessing whether climate change is 
increasing the likelihood of extreme weather 
events. Despite much speculation, there is scant 
scientific evidence for such a link — particularly 
between climate warming, storm frequency 
and economic losses — and the report is 
expected to spark renewed controversy. “It'll 
be interesting to see how the IPCC will handle 
this hot potato where stakes are high but solid 
peer-reviewed results are few,” says Silke Beck, a 
policy expert at the Helmholtz Centre for Envi- 
ronmental Research in Leipzig, Germany. 

The IPCC overhaul is not yet complete. 
Delegates postponed a decision about the 
exact terms of office of the group’s chairman 
and head of the secretariat. Critics say that these 
terms should be strictly limited to the time it 
takes to produce a single assessment report, 
about six or seven years. With no clear deci- 
sion on that issue, Pachauri could theoretically 
remain in office beyond 2014, when the next 
full report is due for release. 

But the Indian economist says he has not 
considered staying on that long. “My job is to 
successfully complete the next assessment,” he 
says. “That's what I’m solely focused on.” 
The space shuttle Endeavour launched for the last time on 16 May. Only one shuttle flight remains. 


SPACE EXPLORATION 


Shuttle’s end spells 
change at NASA 


As the shuttle flies its penultimate mission, the US space 
agency seeks to fill a looming gap in crew transport. 


BY ALEXANDRA WITZE 


Near the Kennedy Space Center in 
Florida, where the space shuttles 
thunder into orbit, roadside signs 
reveal the deep ties that the local commu- 
nity feels to the US space programme. The 
ties are both spiritual and economic: church 
signs wish the shuttle Godspeed before each 
launch, and liquor stores tout their selection 
as ‘out of this world’. 

That relationship is now heading for an 
extended and painful hiatus. On 16 May, 
after several weeks of delays, the space shut- 
tle Endeavour embarked on the penultimate 
shuttle flight, carrying a large cosmic-ray 
detector (see Nature 473, 13-14; 2011) to the 
International Space Station (ISS). The final, 
135th launch of the 30-year shuttle pro- 
gramme will take place by late summer, when 
Atlantis is set to take flight to ferry another 
load of astronauts, equipment and supplies 
to the station. 

“Although we're ending the space-shuttle 
programme, we are not ending the nation’s 
human space-flight programme,” says Philip 
McAlister, acting director of NASA’s commercial 
space-flight development programme. “It’s evolving 
into an exciting new 
paradigm.” Before that new phase begins, 
however, NASA will face years with no crewed 
spaceship of its own, and the Space Coast, the 
region around the Kennedy centre, will lose 
some 8,000 space-related jobs. 

In the short term, NASA will have to buy 
seats aboard Russia's Soyuz capsules, now 
the only way to deliver crew members to the 
ISS. The next such flight, slated for 7 June, 
will carry one astronaut each from Russia, 
Japan and the United States to the station. The 
United States also has places reserved on the 
ten further Soyuz flights that are scheduled 
before the end of 2013. 

In the long term, NASA expects to give private 
companies the responsibility of getting astro- 
nauts into low-Earth orbit, an approach cham- 
pioned by US President Barack Obama. On 
18 April, the agency announced that it would 
share US$269 million among four companies 
developing commercial space-flight options: 
SpaceX, of Hawthorne, California; Boeing, of 
Houston, Texas; Blue Origin of Kent, Wash- 
ington; and Sierra Nevada of Louisville, Colo- 
rado. All say that they will begin flying crewed 
spacecraft in 2014-15. 

SpaceX has already flown its unmanned 
Dragon capsule into orbit and recovered 
it successfully, the only private company 
to manage such a feat. NASA is expected to 
decide in the coming weeks whether to com- 
bine the second and third Dragon flights, 
scheduled for later this year, to advance 
directly to a mission that docks with the ISS 
— an option strongly endorsed by Elon Musk, 
chief executive of SpaceX. 

For NASA, which has always relied on its 
own or Russian rockets to get astronauts into 
space, the commercial crew-transportation 
business represents a fundamental shift to 
a different — and untested — way of doing 
things. Many within the agency are upbeat, 
despite the unknowns. 

“The shuttle was one of the ways to go to 
space,” says Chris Hadfield, an astronaut with 
NASA and the Canadian Space Agency, who 
will spend six months on the ISS next year, 
three as commander. “It was not the only way.” 

Commercial suppliers will also be needed to 
ferry cargo — including research experiments 
— to the ISS. The Soyuz flights can carry very 
little payload, says Tara Ruttley, the station’s 
associate programme scientist. NASA has con- 
tracted for 12 cargo flights with SpaceX and 
8 with Orbital Sciences of Dulles, Virginia, to 
begin as early as next year. Russia's unmanned 
Progress resupply ships can also carry research 
payloads up to the station. 

Freed of the need to develop transport for 
low-Earth orbit, the thinking goes, NASA can 
focus on the task handed to it by Congress 
in last October’s NASA Authorization Act. 
By 2016, the agency is supposed to develop 
a heavy-lift rocket and a crew vehicle to send 
astronauts to distant targets such as the Moon 
or near-Earth asteroids. The rocket will draw 
on work done under the now-defunct Constellation 
programme, developed during the 
previous administration as a next-generation 
replacement for the shuttle. 

The new programme reflects the concerns of 
congressional lawmakers who fear the decline 
of NASA and of regions dependent on the jobs 
it has provided, such as the Space Coast (see 
Nature 472, 16; 2011). In an opinion piece pub- 
lished in the Orlando Sentinel on 26 April, US 
Senator Marco Rubio (Republican, Florida) 
chastised Obama for not allocating enough 
money for the heavy-lift vehicle to meet its 
target launch date in 2016. “The bottom-line 
impact of the president’s space agenda is a full 
retreat from America’s long-standing com- 
mitment to space exploration,” Rubio wrote. 
Others echo Rubio’s ire — something that 
could come back to haunt Obama, given Flor- 
ida’s probable role as a key battleground state 
in the 2012 presidential election. 

For shuttle workers facing imminent job 
loss, commercial flights seem a distant dream. 
Engineers have been dismantling shuttle- 
related equipment at Kennedy’s spare launch 
pad and preparing the already-retired Dis- 
covery shuttle for shipment to its final home 
at a Smithsonian Institution museum near 
Washington DC. For now, the era that is pass- 
ing commands far more attention than the one 
that is promised, as a technical community that 
is used to making space a way of life settles in 
for a long stint on the ground. 
Japan rethinks its energy policy 


Renewables come to the fore as universities take the lead on electricity conservation. 


BY DAVID CYRANOSKI 


In the aftermath of the Fukushima disaster, 
power shortages have forced one of the 
world’s most energy-efficient countries 
to make do with even less. That may become 
the norm after Prime Minister Naoto Kan last 
week shelved a 2010 goal to build 14 nuclear 
reactors over the next 20 years. 

With Japan's energy policy in tatters, 
advocates of renewable energy and efficiency 
savings are seizing the opportunity to argue 
their case. But the measures will have to make 
up a major shortfall: under the previous plan, 
the country’s nuclear generating capacity was 
set to double, to meet half of the nation’s 
electricity needs (see Nature 472, 143-144; 2011). 

Kan had little choice but to change course 
after the accident at the Fukushima Daiichi 
nuclear plant, which was triggered by an 
enormous earthquake and tsunami on 11 March. 
Demonstrators have been calling for the 
closure of some or all of Japan’s 54 nuclear 
reactors ever since. In addition to ditching 
plans for new plants, Kan has promised a much 
greater emphasis on efficiency measures and 
renewable sources, which Japan has been 
relatively slow to adopt. “Before 11 March, there 
was a black cloud over energy policy, formed 
by industry and the industry ministry. Now 
there’s a crack in that,” says Tetsunari Iida, 
executive director of the Institute for 
Sustainable Energy Policies in Tokyo, which advises 
the government on renewable energy. 

Last month, the institute set out an ambitious 
vision for the country’s energy mix. Its plan for 
the Tohoku region calls for energy demand to be 
reduced such that all the region’s needs can be met 
by renewable sources by 
2020. Because Tohoku saw the greatest damage 
from the earthquake, and has a lot of potential 
to exploit wind power, the institute thinks that 
renewables could be introduced quickly while 
creating jobs to regenerate the local economy. 
Countrywide, the institute says that renewa- 
bles’ share of the energy mix should rise from 
about 8% to 30% by 2020, and to 100% by 2050, 
a strategy that requires demand to be halved. “It 
is technically feasible but politically challenging,” 
says Iida. Iida helped to draft a bill that 
guarantees utilities a high price for electricity 
from renewables. The bill was, coincidentally, 
finalized by the government on 11 March and 
is expected to become law next month. 

POWER CUT 
The University of Tokyo is using at least 30% less electricity than it was just before the devastating 11 March earthquake. 
[Figure: peak power consumption (megawatts), March to May.] 

Tatsuo Oyama, an engineer at the National 
Graduate Institute for Policy Studies in Tokyo 
who models electricity investment scenar- 
ios, says that the target might be possible in 
Tohoku. But he warns that the country should 
not become dependent on unpredictable energy 
sources, and that the waning support for nuclear 
energy could be reversed if Japan tackles the 
aftermath of the Fukushima disaster effectively. 
Tokyo's universities are already proving that 
the country still has room for energy savings. 
The University of Tokyo, for example, has cut 
peak power usage by 30-40% by turning off 
lights and air-conditioning, shutting down 
extra lifts and running energy-intensive exper- 
iments at night (see ‘Power cut’). 

Researchers at the university say that their 
low-energy lives are inconvenient, but largely 
manageable. Restricting the use of some equipment 
to off-peak hours is “realistic and feasible”, 
says neurochemist Haruhiko Bito, although he 
adds that scheduling researchers’ energy use 
can be time-consuming and depressing. And 
chemist Eiichi Nakamura says that the loss of 
instruments and computer systems has slowed 
research. “The electricity shortage made us 
realize that we can indeed save energy easily by 
10%”, but that 30% cuts will impact productivity 
in the longer term, he says. Others worry that 
the strategy will discourage younger scientists 
by forcing them to work at night. 

The challenge will only intensify in the 
sweltering summer months. Animal facilities 
and sensitive instruments will take priority for 
precious cooled air, while professors and stu- 
dents sweat in rooms with minimal air-condi- 
tioning. Staggered work schedules and holidays 
are being considered to mitigate the effects. 

With Japan’s energy strategy in flux, these 
conservation policies will probably be in place 
for the foreseeable future. “This is tough but, 
in another sense, this has a positive aspect,” 
says Toshio Yamagata, an ocean modeller at 
the University of Tokyo who has had to cope 
with a 30% cut in supercomputer operating 
time. “It is a good occasion for us to realize 
our resources are not infinite.” He adds that 
a greater appreciation of energy could lead to 
more elegant experiments. “We will design 
experiments more carefully and digest the 
results in a deeper way, rather than just obtaining 
tonnes of data.” 
CNIO director Mariano Barbacid (left) and science minister Cristina Garmendia are at loggerheads. 


‘Soap opera’ sours 
cancer chief hunt 


Spain’s scientists worried by fallout from high-profile spat. 


BY ALISON ABBOTT 


A bitter row between a prominent 
biologist and the country’s 

science minister has spiralled into a 
potentially damaging leadership crisis at the 
country’s top cancer research institute. 

The Spanish National Cancer Research 
Centre (CNIO) in Madrid is trying to find a 
replacement for molecular oncologist Mariano 
Barbacid, who became the centre’s founding 
director in 1998. But an international com- 
mittee of five high-ranking scientists has now 
withdrawn from the process after its search 
for a director became caught up in the quarrel 


between Barbacid and the minister for science 
and innovation, Cristina Garmendia, who has 
repeatedly clashed with Spanish scientists (see 
‘New path for researchers’). 

Members of the search committee were 
dismayed early this month when their shortlist 
of four candidates was leaked to the press, and 
then concerned when Garmendia’s ministry 
scheduled a meeting of the CNIO’s govern- 
ing board to announce the new director for 
16 May, giving them just ten days to make a 
decision. The top candidate withdrew last 
week, although this may not be connected 
with the row. The governing board is now 
seeking new candidates and deferring further 


SPAIN’S SCIENCE LAW 


New path for researchers 


After a two-year gestation, a controversial 
update to Spain’s science law was finally 
approved by Congress on 12 May. The 
bill aims to create a research framework 
akin to those of other European countries, 
including a structured career path and an 
independent research-granting agency. The 
legislation has been a key goal of science 
minister Cristina Garmendia since she 
took office in 2008, and researchers have 
campaigned vigorously to ensure that the 
final bill protects jobs and funding. 

But the final law does not go far enough 
for the Dignified Research campaign, 
supported by 2,500 scientists. They had 
asked the government to implement 
five-year, tenure-track contracts with 
regular evaluations that, if passed, would 
lead to a guaranteed job. Instead, the bill 
creates ‘access contracts’ for postdoctoral 
researchers that do not necessarily lead to 
a permanent position. The Confederation 
of Spanish Scientific Societies is also 
concerned that the law fails to establish 
clearly the independence of the granting 
agency. Michele Catanzaro 
discussions until 22 June. For researchers at 
the institute, events are playing out like an 
unhappy soap opera and could harm their 
chances of attracting a top player. 

Barbacid, a co-discoverer of the first cancer- 
causing gene, or oncogene, announced in 
September 2009 that he wished to step down 
as director to concentrate on his research, once 
a successor could be found. The trouble began 
when the ministry could not offer suitable 
employment conditions for international can- 
didates. A new search started, and there were 
plans for one of the institute’s vice-directors 
to become interim director — until the small 
bombshell of the Experimental Therapeutics 
Programme dropped in December 2010. 

Barbacid had created the programme in 
2006, aiming to develop drug candidates based 
on chemicals that selectively target onco- 
genes. Using government loans to support the 
research, he patented several small molecules 
in October 2010. But at around the same time 
the ministry declined to extend the loans. 

Barbacid says he quickly found two private 
investors to cover the shortfall, but the minis- 
try rejected this plan in December, saying that 
its legal counsel had advised that the arrange- 
ment was incompatible with the laws governing 
research foundations. Barbacid was furious. “I 
checked with good lawyers who made sure the 
scheme was in compliance with the law,” he says. 
“Garmendia’s lawyers are just wrong — and 
her intentions were malicious.” The ministry 
declined to comment to Nature on the row. 

On 4 March, the Spanish parliament 
approved a new law that allows public research 
foundations to attract private funding for 
research and innovation. But the ministry did 
not change its position on the private fund- 
ing for the oncogene drug programme. When 
Barbacid issued a press release on 3 May 
describing a recently published drug target, he 
took the opportunity to criticize the ministry for 
preventing him from developing the findings 
into a potential therapy. The ministry quickly 
hit back at Barbacid for “raising false hopes in 
cancer sufferers”, and for seriously breaching 
professional ethics. It then announced that his 
successor would be unveiled on 16 May. “It was a 
visceral response to get rid of me,” says Barbacid. 
A ministry spokesman denied there was a 
connection between the two announcements. 

Miguel Angel Piris, a former CNIO vice- 
president who left in February to take up a 
position at the University Hospital Marques de 
Valdecilla in Santander, Spain, says he “hopes 
that the clash will not harm the prestige of the 
CNIO or its director, who has managed to create 
a respected and prestigious institution”. Miguel 
Beato, director of the Centre for Genomic 
Regulation in Barcelona, adds that the affair 
underlines a weakness in Spain's science system. 
“Spain doesn't have research councils or 
authoritative research agencies who can mediate 
such rows,” he says. “Politicians have too much 
direct influence on science here.” 
The honeybee is under threat from a formidable array of pathogens, including the Varroa mite seen here. 
Geneticists bid to 
build a better bee 


Honeybee genome offers clues for fighting diseases. 


BY GWYNETH DICKEY ZAKAIB 


is a prized resource, yet he spends much 
of his time removing it. Cornman, a 
geneticist for the Bee Research Laboratory of 
the US Department of Agriculture (USDA) in 
Beltsville, Maryland, is trying to characterize 
the various pathogens that plague the honey- 
bee (Apis mellifera), arguably the world’s most 
important insect. His strategy is to subtract the 
honeybee genome from every other stray bit 
of genetic residue he can find in bee colonies, 
healthy and diseased. The remaining genetic 
material gives a complex metagenomic portrait 
of other organisms that inhabit the bee's world, 
including viruses, bacteria and fungi — some 
novel — that, alone or in combination, might 
push a bee colony into precipitous decline. 
“Right now we're in the discovery phase, 
where we're trying to identify what's present,” 
says Cornman. “Then we can start looking at 
the interactions of pathogens and see if they're 
more virulent than any by themselves.” 
Cornman was among 100 or so research- 
ers in attendance last week at the Honey Bee 
Genomics & Biology meeting, held at Cold 
Spring Harbor Laboratory in New York. It was 
the first dedicated conference on the topic 
since researchers met four years ago, soon 
after the honeybee genome was sequenced 
(Honeybee Genome Sequencing Consortium 
Nature 443, 931-949; 2006), and for many it 
was a chance to marvel at a field transformed. 
“There has been a lot of progress made on 
how disease affects honeybees at the molecu- 
lar level,” says Christina Grozinger, director of 
Pennsylvania State University’s Center for Pol- 
linator Research in University Park, one of the 
conference organizers. Around the same time 
that the genome was first published, honey- 
bee colonies across much of the Northern 
Hemisphere began to show alarming declines. 
A syndrome dubbed colony collapse disorder 
(CCD) has been causing the insects to die off in 
large numbers, leaving well-provisioned hives 
suddenly empty. Meanwhile, other parasites, 
such as the Varroa mite (Varroa destructor), 
which spreads harmful viruses, continue to 
take their toll. Annual surveys in the United 
States show that almost 35% of all colonies die 
during a typical winter. Genomics is yielding 
new clues to the still-mysterious phenomenon, 
as well as potential strategies for protecting the 
insects from a multitude of threats. 

At the meeting, Cornman presented data 
showing that hives affected by CCD have higher 
levels of microscopic gut fungi called Nosema, 
and a greater prevalence of several viruses, two 
of which had not been detected in bees before. 

Yet despite having a multitude of enemies, 
many bees are holding their own, says research 
entomologist Jay Evans of the USDA’s bee 
laboratory. “The question is not why are bees 
getting sick, but how are they surviving against 
this onslaught of parasites,” he says. 

The genome offers a window into the bees’ 
immune pathways, Evans adds. The goal is 
to identify the genes that are crucial in help- 
ing bees thwart attack, and, ultimately, to 
strengthen these defences. “You can breed for 
these traits, but with genetic markers you could 
do it faster,” he says. 

In cases in which nature cannot do the job, 
some researchers are now exploring more 
direct ways of boosting bees’ resilience. In 
some insects, double-stranded RNA, a hall- 
mark of viral infection, can provoke a specific 
antiviral immune response. At the meeting, 
Michelle Flenniken, a virologist at the Uni- 
versity of California, San Francisco, presented 
evidence that, in honeybees, it can also trigger 
a general immune response that might ward 
off a variety of threats. “This may be a new viral
response that hasn't been well-characterized in
honeybees,” says Flenniken, who is exploring
the genes involved in the process. “What we 
think we've found is a window into this new 
immune-response pathway.” 

Flenniken adds that knowing more about 
the bee’s immune responses might help 
researchers to find ways of “priming the sys- 
tem” and help bees to cope with their foes at 
the genomic level. Such a prospect may be a 
long way off, but it’s certain to keep researchers 
abuzz until their next gathering.

NUCLEAR ENERGY


Battle of Yucca 
Mountain rages on 


Proposed interim storage unlikely to settle US debate. 


BY JEFF TOLLEFSON 


Staff have been cut, contractors laid off,
offices closed and even furniture disposed
of. But despite all its efforts to back away
from plans to store spent nuclear fuel deep 
under Yucca Mountain, Nevada, the admin- 
istration of US President Barack Obama just 
can't seem to bury the idea. 

An expert commission appointed by the 
administration is looking for an alternative 
solution. On 13 May, at a public meeting 
in Washington DC, commissioners dis- 
cussed some preliminary recommendations: 
create one or more centralized facilities at 
which waste would be 
temporarily stored in 
dry casks, while engag- 
ing with the public in a 
new process to identify 
a permanent repository 
for the piles of spent 
nuclear fuel accumulat- 
ing at US reactors. But 
given the history of doubts 
about the site’s geology 
and the state-wide oppo- 
sition that has plagued 
Yucca Mountain since it 
was singled out by the US 
Congress nearly a quarter 
of a century ago, many are 
sceptical that a more pal- 
atable answer will emerge. 

“It is important that 
there will be a consensus 
recommendation, but it is 
our view that most of the issues associated with 
used nuclear fuel have been considered for a 
long time,” says Alex Flint, senior vice-presi- 
dent for governmental affairs at the Nuclear 
Energy Institute in Washington DC. The 
ongoing nuclear disaster in Japan is adding 
some urgency to the question, Flint says, but 
“it hasn't made reaching agreement any easier”. 

Meanwhile, an 8 April report from the 
Government Accountability Office (GAO), 
an independent arm of Congress, says the 
Department of Energy should “develop a pre- 
liminary plan to restart the project” at Yucca 
Mountain, anticipating that future policy shifts,
and a pair of legal challenges from states that
want to get rid of the waste piling up within
their borders, may force it to do just that.

A GROWING DILEMMA
Nuclear waste in temporary storage in the United States
already exceeds the limit set for Yucca Mountain.

Even after decades’ worth of research on the 
site, costing more than US$15 billion, doubts 
remain over the technical suitability of Yucca 
Mountain, given factors such as seismic activ- 
ity and water infiltration. But the politics were 
crystal clear when Obama promised to shut 
it down during his 2008 election campaign. 
In keeping with that promise, last year the 
energy department filed a motion to withdraw 
its application to store nuclear waste at Yucca 
Mountain — offering no technical or scientific 
reasons for the reversal, except to say that the 
project was “not a workable option”.

By then, however, the Nuclear Regulatory
Commission (NRC) was in the midst of a regulatory
assessment that, barring the inevitable lawsuits,
could have cleared the way for waste shipments to
Yucca Mountain, as directed under a federal law
signed by President George W. Bush in 2002. Now the
department's decision to withdraw is being challenged
within the NRC, in federal courts and on Capitol Hill.

[Figure labels for ‘A growing dilemma’, partly illegible: weapons-related nuclear waste, 9,000 tonnes (estimate); amount accumulating annually at reactors, 2,000 tonnes.]

“Now that this administration has decided to
ignore the law, our nation has no long-term storage
plans for radioactive wastes,” lamented Republican
(Georgia) representative Paul Broun,
chairman of the House Science Subcommit- 
tee on Investigations and Oversight, during a 
congressional hearing last week. He and other 
Republicans on the science panel, together with 
two other House committees, are challenging 
the administration's decision and demanding 
documentation. 

Their complaints have new resonance in the 
wake of Japan’s Fukushima Daiichi power plant 
disaster, in which radioactivity from nuclear 
waste stored at the plant apparently escaped 
into the environment. In the United States, 
more than 65,000 tonnes of spent nuclear 
fuel from commercial reactors currently sit in 
temporary storage, with around 2,000 more
tonnes accumulating every year (see ‘A growing
dilemma’). Combined with waste from
weapons programmes, the amount surpasses 
what has been set as Yucca Mountain's statu- 
tory limit, although there is room to expand 
should the site find itself back in business. 
Much of the waste resides in storage pools at 
reactor sites, like those at Fukushima. 

Additional pressure is coming from states 
that want their spent nuclear fuel moved out. 
Washington and South Carolina are leading 
challenges to the Department of Energy's deci- 
sion to withdraw from Yucca Mountain, both 
at the NRC and in the District of Columbia 
Federal Appeals Court. “Our reading of the 
law is that the issue needs to be concluded on 
the basis of its technical merits,” says Mary Sue 
Wilson, a lawyer working on the case for the 
Washington State Attorney General. 

Within the NRC itself, the Atomic Safety 
and Licensing Board ruled last year that the 
energy department does not have the legal 
authority to withdraw its application. A final 
decision on that case is now pending before the 
full commission, which is chaired by Gregory 
Jaczko, a political appointee. He is the former
chief of staff to Senate majority leader Harry
Reid of Nevada, who has spearheaded opposition
to Yucca Mountain, and many believe that Jaczko
is stalling to prevent a ruling against the
administration.

“The chairman controls when the NRC 
votes, and the chairman doesn't like the current
vote,” says Lake Barrett, a consultant and
former deputy director at the energy depart- 
ment. NRC officials say the commissioners are 
still deliberating on the issue. 

Speaking at the Annual Nuclear Industry
Conference and Nuclear Supplier Expo in Washington
DC on 11 May, deputy energy secretary Daniel Poneman
told industry officials that the administration is
hoping the presidential commission will find a way
to reshape the discussion and build the kind
of consensus that will at last allow the country
to move forward.

“Clearly, the mistake we made in 1987 was 
jamming it down the throat of the Nevadans,” 
says Phil Sharp, a commission member and 
president of Resources for the Future, a think 
tank based in Washington DC. Sharp says the 
government must work with the public and 
communities, presenting nuclear waste disposal 
as a national priority in a way that appeals to 
people's patriotism.

The commission intends to issue a draft
report in July and a final one next January. 
With its recommendations in hand, the 
administration is expected to propose legis- 
lation that would establish a new process for 
identifying nuclear waste storage sites. 

Yet such a process could well take decades, 
the GAO report concludes, and the govern- 
ment’s reversal at Yucca Mountain could 
serve to galvanize public opposition at other 
candidate sites. Since the debate began, “no 
states have expressed an interest in hosting a 
permanent repository for this spent nuclear 
fuel ... including the states with sites currently 
storing the waste”, the report adds. The commission's
scheme for an interim storage facility
may prove no more appealing, given fears that 
‘interim’ means permanent as long as the pre- 
sent impasse continues. Such fears have in the 
past halted interim storage proposals in states 
such as Wyoming. And even if one commu- 
nity decides that it is willing to play host to the 
waste, that doesn’t mean others won't challenge 
nuclear-waste transportation routes. 

Nevertheless, the nation will need to find 
a permanent repository at some point, and 
Yucca Mountain, it seems, is down but not out. 
“Yucca Mountain has nine lives,” says Ed Davis, 
a nuclear consultant who heads the Pegasus 
Group in Washington DC. “And nobody knows 
how many lives have been used up.”

America’s top
CLIMATE COP 


The United States has abandoned comprehensive greenhouse- 
gas curbs, but California is pressing ahead. Mary Nichols is 
leading the fight against emissions. 


BY JEFF TOLLEFSON 


Mary Nichols can take some pride in the view as she
drives out of Los Angeles. The San Gabriel Mountains
rise up to the north, framed by blue sky with just a 
touch of midday haze. The clear vista comes in large part because 
of the California Air Resources Board (CARB), the agency that 
Nichols leads, which has spent decades cleaning up the city’s air. 
Now she and her team are setting their sights even higher, with
an ambitious plan to cut California's greenhouse-gas emissions.
With an economy that outranks all but eight countries, Cali- 
fornia is a political and economic heavyweight that has never 
been afraid to flex its muscles. It is big enough to make an impact,
and now that politicians in Washington DC have abandoned 
attempts to enact a national climate law, California is forging 
ahead on its own. Nichols feels the burden of that strategy 
acutely, and she is well aware of the challenges ahead. 
In the run-up to the state elections last November, many 
feared that Californian voters would follow Washington DC’s 
lead and cast aside the state’s landmark climate legislation,
AB 32. The 2006 law requires a 10% reduction in greenhouse-
gas emissions by 2020, and critics — fuelled in part by donations 
from the fossil-fuel industry — argued that the state's economy 
was too fragile to withstand aggressive new regulations. But 
voters turned out en masse to preserve the initiative, which is 
the first comprehensive climate programme in the United States. 
California has committed to reducing emissions by the same 
percentage as the European Union, and the state’s unique plan 
could chart new ground internationally. 

Since the 1970s, California has pushed the boundaries of 
environmental regulation, acting out of both pride and self- 
preservation. The state has pioneered environmental laws 
targeting air pollution, water contamination and toxic chemi- 
cals. It has advanced the sciences of atmospheric physics and 
chemistry, developed pollution-control technologies and 
bullied powerful industries into submission in an epic battle 
against choking smog in the Los Angeles basin. Other states, 
and eventually the nation, have followed California's path in
developing regulations to control pollution.

But Nichols and her staff at CARB need to go even further to rein in 
greenhouse-gas emissions. The agency plans to clean up vehicle fuels, 
promote renewable energy and squeeze more reductions by improving 
energy efficiency. It is also designing the world’s most comprehensive 
carbon market, set to launch at the start of 2012. Nichols believes that 
California will one day be able to demonstrate to the rest of the country 
how environmental protection and economic growth can coexist. 

“People in this state are bullish on the ability of California to sur- 
vive and change, and they fundamentally care about air pollution and 
environmental issues,” says Nichols. “What we do here matters.” 

From Washington DC to Brussels and Beijing, government leaders 
will monitor the state's progress closely. Henry Derwent, president of 
the International Emissions Trading Association based in Geneva, Swit- 
zerland, says that California’s plans are reassuring governments around 
the world that all is not lost in the United States. “The overriding feeling 
in Europe at the government level is relief,” says Derwent. “Even though
it’s not the entire United States, it’s a pretty big consolation prize.” 


CHARM OFFENSIVE 

On this day in February, Nichols is travelling from her office in Los 
Angeles to a conference on sustainable growth at the California State 
Polytechnic University in Pomona. But the route along Interstate 10 
illustrates the scale of the problem. The greater Los Angeles urban 
area sprawls outwards through towns and cities, filled with millions of 
people who love their vehicles. 

Despite that, Los Angeles has managed to clean its air through a 
productive interplay between technology and environmental policy. 
Nichols says that modern vehicles produce 1% of the toxic pollutants 
emitted by their forerunners in 1975. The city’s population has doubled 
since then and the use of vehicles has grown at an even faster rate, yet 
the air just keeps getting cleaner. 

But CARB now faces a bigger and broader challenge. If no action is 
taken, California's emissions are projected to climb from 474 million 
metric tonnes of carbon dioxide equivalent in 2008 to 596 million met- 
ric tonnes in 2020. To reach the target set in AB 32, Nichols and CARB 
must get the total down to 427 million metric tonnes, the amount that 
the state was emitting in 1990 (see ‘Cleaning up California’). To do that, 
they need to make emissions reductions everywhere they can, and that 
is what brings Nichols to Pomona. 
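
The arithmetic behind these figures can be checked in a few lines. The sketch below is illustrative only: the three emissions totals come from the paragraph above, and `percent_cut` is a hypothetical helper name, not anything used by CARB.

```python
# Figures quoted in the article, in million metric tonnes of CO2 equivalent.
EMISSIONS_2008 = 474   # actual emissions in 2008
PROJECTED_2020 = 596   # projection for 2020 if no action is taken
TARGET_2020 = 427      # AB 32 target, equal to the 1990 level


def percent_cut(baseline, target):
    """Percentage reduction needed to go from baseline down to target."""
    return 100 * (baseline - target) / baseline


cut_vs_2008 = percent_cut(EMISSIONS_2008, TARGET_2020)
cut_vs_projection = percent_cut(PROJECTED_2020, TARGET_2020)

print(f"Cut relative to 2008 emissions:      {cut_vs_2008:.1f}%")
print(f"Cut relative to the 2020 projection: {cut_vs_projection:.1f}%")
```

The two results round to about 9.9% and 28.4%, which is consistent with the roughly 10% reduction that AB 32 requires and with the 28% cut from projected levels quoted in ‘Cleaning up California’.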

She is addressing the small conference regarding one of the latest tools 
in CARB’s belt: SB 375, a 2008 law requiring the agency to set targets for 
greenhouse-gas emissions from vehicles in all metropolitan areas. Her 
team set those targets last September, and the local and regional plan- 
ning organizations must now develop strategies to meet them by, for 
example, promoting public transport, bike lanes and mixed-use zoning 
that brings amenities to people instead of forcing them to drive. 

CARB set a 13%-reduction target for the area that includes Los Ange- 
les, but many local officials complained that the state was imposing 
costly rules without providing any money to help them comply. Nichols 
knows that some of those officials are in the audience, and she has come 
in peace. As she steps up to the microphone, she gives a confident smile 
and disarms the sceptical leaders by acknowledging that the law’s future 
is in their hands. “You could probably ignore it,” she says, scanning the
quiet audience for a reaction. “Nothing will happen, as far as I can tell.”

Nichols then launches into a pep talk. SB 375 is not a top-down 
state solution, she says, but a bottom-up tool to help local and regional 
governments make their communities into more livable places, where 
people walk and exercise and spend more time with their families and 
less time alone in cars. This kind of master planning, she says, could set 
the stage for more organized — and less conten- 
ist so much as a peep of protest, and by the time lunch rolls around
conversations are focusing on how to implement the law. 

“If we see there is rising opposition, then we need to act and explain 
or make adjustments,” says Nichols on her way back to the office. That
kind of flexibility makes it easier for states than the federal government 
to negotiate difficult new regulations, she adds. “We are closer to the 
people that we regulate.”

If Nichols makes it look easy, she has had a lot of practice. An environ- 
mental lawyer by training, Nichols is a diehard Democrat who has 
burnished her credentials working for environmental groups. She has 
also honed her diplomatic skills in various government posts, including 
a previous stint as head of CARB, from 1979 to 1983. She eventually rose 
to assistant administrator of air and radiation at the US Environmental
Protection Agency (EPA) in 1993, under President Bill Clinton. For
Nichols, these political appointments […]
to put ideas into practice and put her
stamp on the world.

By the time the California legislature enacted AB 32 in 2006, Nichols
was ensconced in academia as director
of the Institute of the Environment at 
the University of California, Los Ange- 
les. She wasn’t looking for a job when 
Republican governor Arnold Schwarzenegger asked her in 2007 to take 
over CARB and find a way to meet the target, nor was she particularly 
thrilled about going to work for a Republican film star. She jokes that 
when she met Schwarzenegger, she interviewed him for the job, and he 
passed the test. Convinced that he was genuinely interested in making 
the programme work, Nichols jumped back into government. 


IN THE DRIVER’S SEAT 

CARB’s plan bets heavily on innovation, some of which the agency is 
developing and testing at its own facilities. Nichols spends much of her 
time working from CARB’s main science laboratory in El Monte, east of 
central Los Angeles. This is where agency engineers invented the check- 
engine light in the 1980s to alert drivers to problems with their vehicle's 
pollution-control systems. CARB is now developing automated sensors 
that will allow technicians to more accurately track emissions data in 
cars using a secure onboard computer. Engineers are busy analysing 
emissions from advanced vehicles, testing the performance of hybrid 
electric cars and studying how various technologies could help the state 
to meet its 2020 goal and a further, non-binding commitment to reduce 
greenhouse-gas emissions by some 80% by mid-century. 

The most ambitious element of CARB’s plan is an overarching 
cap-and-trade programme that will cover roughly 85% of the state's 
emissions by 2015. Under that system, the state will issue a set number 
of allowances — initially for free but later through an auction — that 
companies will need to cover their greenhouse-gas emissions. The total 
number of permits will decrease each year, and companies will need 
to either reduce their emissions or buy spare allowances from other 
companies that have made reductions more cheaply. 
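
The logic of a declining cap with trading can be sketched with a toy model. Everything in it is an assumption made for illustration: the two firms, their emissions, their per-tonne abatement costs and the three caps are invented, and the function `settle_year` is hypothetical. It also simplifies heavily by assuming that trading leads to the cheapest available cuts being made first, and it ignores auctions, banking of allowances and offsets.

```python
def settle_year(cap, emissions, abatement_cost):
    """Return (total_cost, final_emissions) for one compliance year.

    cap            -- total allowances issued this year, in tonnes
    emissions      -- dict of firm -> emissions before any cuts, in tonnes
    abatement_cost -- dict of firm -> cost of cutting one tonne
    """
    required_cut = max(0, sum(emissions.values()) - cap)
    final = dict(emissions)
    total_cost = 0
    # Assumption: with trading, the cheapest reductions happen first,
    # and the firm that makes them sells its spare allowances to the others.
    for firm in sorted(emissions, key=lambda f: abatement_cost[f]):
        if required_cut == 0:
            break
        cut = min(required_cut, final[firm])
        final[firm] -= cut
        total_cost += cut * abatement_cost[firm]
        required_cut -= cut
    return total_cost, final


# Hypothetical firms and numbers, chosen only to show the mechanism.
EMISSIONS = {"utility": 100, "refinery": 100}
COST_PER_TONNE = {"utility": 10, "refinery": 40}

for year, cap in enumerate([180, 160, 140], start=1):
    cost, final = settle_year(cap, EMISSIONS, COST_PER_TONNE)
    print(f"year {year}: cap={cap}, emissions={final}, abatement cost={cost}")
```

In this sketch the utility, which can cut emissions more cheaply, makes all of the reductions as the cap tightens, while the refinery keeps emitting and would buy the utility's spare allowances. Total emissions never exceed the cap, which is the guarantee that sector-by-sector regulations alone do not provide.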

The cap-and-trade programme is an insurance policy. On their own, 
individual regulations for vehicle efficiency, renewable energy and other 
items will lower emissions, but they do not guarantee that the state will 
meet its targeted reductions. The cap-and-trade programme should — if 
it ever gets off the ground. In March, a California judge determined in a
preliminary finding that CARB had failed to do a proper environmental 
analysis of the programme. The agency is now awaiting a final ruling 
on how to proceed, but CARB officials hope that the programme will 
move forward on schedule to begin next year. 

Meanwhile, the agency is pressing ahead with other bold plans. CARB 
is working with partners in Brazil and Mexico to design what would be 
the world’s first market-based programme to allow businesses to offset 
their emissions by protecting tropical forests. The agency is also establishing
another type of offset, involving ozone-depleting compounds
such as chlorofluorocarbons, which are powerful greenhouse gases
not included in the United Nations’ 1997 Kyoto Protocol for reduc- 
ing greenhouse-gas concentrations. Companies in California could 
avoid reducing their emissions of carbon dioxide or other Kyoto gases 
by curbing — or paying someone else to curb — their emissions of the 
non-Kyoto greenhouse gases, which are not targeted by a more limited 
cap-and-trade scheme launched by the European Union in 2005. 

CARB is trying to avoid pitfalls revealed by the European programme. 
That scheme, for example, initially issued too many allocations, which 
led to a collapse in the price of carbon. CARB is taking care to keep an 
inventory of emissions, so that it can issue an accurate number of initial 
allowances. But the inventory is calculated in part from figures pro- 
vided by polluters, so CARB is also carrying out an independent check, 
funding scientists to measure concentrations of greenhouse gases and 
other pollutants in the field and then calculate emissions from that data. 
Already, CARB knows that methane emissions around Los Angeles are 
higher than the inventory suggests. 

Nicholas Bianco, a senior associate at the World Resources Institute in 
Washington DC who advises agencies on emissions reduction, says that 
the California cap-and-trade scheme represents a major step forward. 
“It will be the first of its kind in the world.”

James Sweeney, director of the Precourt Energy Efficiency Center 
at Stanford University in California, says that what is happening in the 
state is exciting, but he has two fears. The first is that funding for energy 
and climate research will dry up in the current budget crisis, making 
the challenge of meeting long-term greenhouse-gas reduction targets 
in California and elsewhere even more difficult. The second relates to 
scale. California is important, but it represents just 7% of US emissions. 

“The bottom line,” says Sweeney, “is that if California is going to have 
a real impact it will be as the laboratory for the nation.”

The chances of that happening are unclear. Northeastern states have a 
limited cap-and-trade programme for power plants, but western states 
have backed away from joining California's scheme, although at least
three Canadian provinces are expressing interest. California isn’t big
enough to run its own system forever, says Nichols, but the state will stay
the course for now. She points out that the agency has history on its side.

CLEANING UP CALIFORNIA
A state law mandates that California must shrink its greenhouse-gas emissions by 28% from the levels
currently projected for 2020. It plans to do this through regulations targeting individual sectors, in
combination with an overarching cap-and-trade programme, which imposes a cost on emissions.
Reaching beyond specific regulations, an overall
cap-and-trade programme would shave 18 million
tonnes from the total state emissions.
Energy-efficiency efforts and expanded use of renewable
energy will cut emissions associated with electricity.
Vehicles produce almost a third of California’s greenhouse-gas emissions.
Reductions will come from tighter standards for vehicles and fuels.

When CARB published its first greenhouse-gas regulations for cars 
in 2004, it quickly ran into legal battles with the automobile industry 
and the administration of President George W. Bush. But last year, the 
Obama administration brought the various players together in a deal 
that essentially established CARB’s vehicle regulations as national ones. 

“When we started the first round of greenhouse-gas standards, the 
automobile companies wouldn't even talk to us,” says Paul Hughes, who 
headed the effort as manager of the Low Emission Vehicle programme. 
Today, Hughes says, car makers are engaged at every step in the pro- 
cess as CARB and the EPA prepare to release identical new standards 
for California and the nation for model years 2017-25. Due late this 
year, those regulations are expected to translate into an average fuel- 
efficiency rating of 20-26 kilometres per litre for cars and trucks — a 
big jump from the current standard of less than 12 kilometres per litre. 
For Hughes, it is just a matter of time before other CARB policies diffuse 
outward and upward into the national scene. 

On the drive back from Pomona, Nichols ponders the roller-coaster 
progress of the past few years. With Obama in the White House, it looked 
as if the United States was finally gearing up for a serious push on global 
warming. Then lawmakers rejected the idea, leaving California on its own. 

The optimist in Nichols thinks that the United States will eventually 
find its way on climate. But she is also a realist and has a simple message 
for the rest of the country. “California set itself up to be at the head of 
what we thought was going to be a parade, but part of being a successful 
leader is having followers,” she says. “At the end of the day, Californians
are not going to accept a lonely role as the sole state in the union that is 
doing anything in terms of carbon.”


Jeff Tollefson covers energy and environment for Nature in 
Washington DC.


The growing pains of pluripotency


The field of induced pluripotent stem 
cells has grown up fast. Now it is 
entering the difficult stage. 


BY ERIKA CHECK HAYDEN 


Five years is an eye-blink in science, but that’s all the time it
has taken for the concept of adult-cell reprogramming to
revolutionize the field of regenerative medicine. In August
2006, Shinya Yamanaka of Kyoto University in Japan told the
world that he had turned mouse skin cells into induced pluripotent
stem (iPS) cells, capable of becoming many types of cell¹. The
following year, he repeated the feat for human cells².

Like human embryonic stem cells, iPS cells could potentially
be used as therapies, disease models or in drug screening. And
iPS cells have clear advantages: they can be made from adult cells,
avoiding the contentious need for a human embryo, and they can
be derived from people with diseases to create models or even
therapies based on a person's genetic make-up. Scientists predicted
that iPS cells would change the face of biology and medicine,
and some would say they already have. In the past year
or so, researchers have published cellular models derived from
iPS cells for a staggering array of conditions, from heart defects³
to schizophrenia⁴. And treatments based on iPS cells are moving
toward the clinic: in California, for example, a team hopes to gain
approval within the next three years to start treating people with
the devastating skin disease epidermolysis bullosa using skin
tissue grown from their iPS cells.

Yet work in the past few months has highlighted several potential
roadblocks. Reprogramming can be inefficient and induce
mutations; the reprogrammed cells cannot develop into some
cell types; and those they can generate are not always a good
model for disease. New issues are emerging apace: work published
last week⁵ shows that, in a particular strain of mice, iPS
cells cause immune reactions when they are transplanted into
other mice with the same genetic make-up, raising questions
about whether it will be possible to transplant iPS-cell-derived
tissue back into the person from whom it is made. No one doubts
that iPS cells still have enormous potential, but the field’s initial
optimism has cooled. “Right now, we are a long way from being
sophisticated enough to take advantage of these cells’ potential,”
says neuroscientist Arnold Kriegstein of the University of California,
San Francisco (UCSF). “Things are still at a very early
stage.” Here, Nature looks at some of the field’s biggest challenges,
and how they are being tackled.


272 | NATURE | VOL 473 | 19 MAY 2011


FINDING A RECIPE 

From the start, biologists have tried to devise safer and more 
efficient recipes for making iPS cells than Yamanaka’s method, 
which used a retrovirus to deliver a powerful shot of four genetic 
reprogramming factors into cells. Retroviruses integrate into a 
host cell’s DNA and can therefore potentially disrupt gene expres- 
sion and lead to cancer; and one of the reprogramming factors, 
Myc, is itself an oncogene that could cause cancer.

To outsiders at least, a new, ‘improved’ reprogramming 
method seems to be published every month. But Yamanaka’s 
retroviral method is still the most efficient, and the one used 
most widely. The retroviral technique can transform about 
0.01% of human skin stem cells into pluripotent cell lines; by 
comparison, adenoviruses, which do not integrate into the 
genome, transform just 0.0001-0.0018% of cells⁶, and delivering
the reprogramming factors directly into a human cell
transforms 0.001% (ref. 7). Inefficiency increases the cost and 
difficulty of deriving iPS cells for cell banks, and poses a partic- 
ular problem when working with rare cell sources. Researchers 
have also tried omitting Myc, as well as silencing it or stripping 
it from the cell once reprogramming is complete. But these 
workarounds also lower the efficiency of reprogramming, and 
a silenced Myc might be reactivated. 
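
To see why these efficiencies matter in practice, it helps to turn each percentage into the number of starting cells needed, on average, for one cell to be reprogrammed. The sketch below does only that conversion. The efficiency figures are the ones quoted above; the function name `cells_per_reprogrammed_cell` and the method labels are assumptions made for illustration, and the calculation treats each cell as an independent trial, which is a simplification.

```python
def cells_per_reprogrammed_cell(efficiency_percent):
    """Average number of starting cells per successfully reprogrammed cell."""
    if efficiency_percent <= 0:
        raise ValueError("efficiency must be positive")
    return 100 / efficiency_percent


# Efficiencies quoted in the article, as percentages of starting cells.
METHODS = {
    "retrovirus": 0.01,
    "direct delivery of factors": 0.001,
    "adenovirus (upper bound)": 0.0018,
    "adenovirus (lower bound)": 0.0001,
}

for method, efficiency in METHODS.items():
    needed = cells_per_reprogrammed_cell(efficiency)
    print(f"{method}: about {needed:,.0f} starting cells per reprogrammed cell")
```

On these numbers the retroviral method needs about 10,000 starting cells per success, whereas the least efficient adenoviral figure implies about a million, which illustrates why low efficiency is a particular problem when the source cells are rare.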

Addressing these concerns is already a top priority for the field. 
Researchers continue to tinker with their reprogramming recipes, 
trying to find the factors, and means of delivery, that are the most 
efficient and don't increase the risk of cancer. In April, a group led 
by Edward Morrisey at the University of Pennsylvania in Philadel- 
phia reported that it could boost the efficiency of reprogramming 
by two orders of magnitude over standard techniques by using a 
retrovirus to shuttle in a cluster of microRNAs⁸. “It is very important
for us to get these reprogramming methods to work well
enough so that we can compare them and see whether they make 
any difference in the stability of the cells and in tumorigenicity,”
says developmental neurobiologist Jeanne
Loring of The Scripps Research Institute in
La Jolla, California. “No one has done that
yet, and it is going to be a long haul before
we figure this out.”


PATCHING THE SCARS

A whole new set of questions has arisen over the past year, 
concerning the genetic impact of the reprogramming process. In 
July 2010, groups led by George Daley⁹ at the Children’s Hospital
Boston and by Konrad Hochedlinger¹⁰ at the Massachusetts
General Hospital in Cambridge published studies showing that 
iPS cells carried an ‘epigenetic memory’ — chemical modifica- 
tions in their DNA that had come from the original adult cells 
and had not been erased by the reprogramming. This, they say, 
explains why iPS cells cannot generate as many adult cell types as 
embryonic stem cells can. 

Researchers were soon reporting that iPS cells were more 
likely to contain mutations than cultured human embryonic 
stem cells. Four groups scoured the genomes of iPS cells for 
changes in single DNA bases¹¹, DNA rearrangements called
copy-number variations¹²,¹³ and differences in chromosome
number¹⁴. The studies found higher levels of all three. Worse,
the mutations in iPS cells were not just inherited from the par- 
ent cells — some seemed to result from the reprogramming and 
culture process. Loring’s group reported, for instance, that a
protocol for differentiating iPS cells into cardiac cells selected 
for cells with genetic rearrangements. 

This picture is still coming together. One of the studies on
copy-number variations found that many of the rearrangements 
disappeared after the iPS cells were cultured over long periods 
of time, probably because the most severely mutated cells were 
outcompeted by the genetically healthier ones. But this Febru- 
ary, a team led by Joseph Ecker at the Salk Institute for Biologi- 
cal Studies in La Jolla, California, reported that it had detected 
epigenetic signatures of the parent cells in human iPS cell lines 
even after they had been cultured many times and differentiated 
into specific cell types. A third study suggested that iPS cells
are no worse than embryonic stem cells in this regard. Devel- 
opmental biologist Alexander Meissner of Harvard University 
in Cambridge and his team reported that epigenetic and genetic 
variation was similar across 20 human embryonic stem cell lines 
and 12 iPS cell lines. “What we see is not so much a lot of varia- 
tion across iPS cells, but a lot of variation across pluripotent cells,” 
Meissner says. 

Researchers expressed their concerns about such effects at a 
meeting convened by the US National Institutes of Health and the 
Food and Drug Administration (FDA) in Bethesda, Maryland, on 
21-22 March that focused on hurdles to translating research on 
pluripotent cells into the clinic. The concern is that the mutations 
could have unpredictable and undesirable effects on the cells, and 
on the patients they end up in. “The genomic changes are going 
to be a big deal to the FDA,” Loring says. 

Meissner’s group has devised a ‘scorecard’ of gene expres- 
sion and methylation — a type of epigenetic mark — that cor- 
relates with an iPS cell line’s level of pluripotency. It should help 
researchers to identify and avoid the practices that generate the 
worst genetic aberrations, and to screen for the lines that are least 
affected. And researchers are beginning to examine whether and 
how these genetic and epigenetic effects affect the capabilities and 
characteristics of iPS cells. 

“Right now there are two schools of thought on this,” Loring 
says. “One is that the sky is falling, and the other is that it’s a good 
thing that we're finding out about this now, so that we can discover 
whether these are biologically relevant changes.” 


HITTING THE LIMITS 

iPS cells are immensely flexible, but they can’t do 
everything. Liver cells derived from iPS cells could in theory 
replace animals in drug toxicology screening, for example. But 
researchers have struggled to get any human stem cells to differ- 
entiate into tissues, such as liver, that are normally derived from 
the endoderm — the most interior of the three germ layers that 
make up the embryo. This might be because the series of signals 
necessary for these cells’ development and complex function are 
difficult to recapitulate. Liver specialist Holger Willenbring at the 
UCSF points out that hepatocytes have many roles, from detoxify- 
ing the blood to making circulating proteins. “There are a lot of 
different jobs that the cell has to accomplish, and it is hard to get 
that right in cell culture,” he says. 

One of the hottest areas of research on human stem cells is in 
cell-replacement therapy for type-1 diabetes, a disease that devel- 
ops when insulin-producing cells in the pancreas are destroyed. 
But no one has been able to make a fully functional and mature 
insulin-producing pancreatic beta cell — also derived from the 
endoderm — because researchers don’t know the exact series of 
growth signals necessary and, perhaps, because beta cells usually 
develop in a three-dimensional environment that is difficult to rep- 
licate in a culture dish. Maybe this won't matter, says developmental 
biologist Matthias Hebrok at the UCSF. Beta-cell ‘progenitors’ have
been made, from embryonic stem cells and from iPS cells, that can 
secrete insulin. They are less efficient than a normal beta cell, but 
perhaps efficient enough to help a person with diabetes. 

Researchers are making a concerted effort to devise the correct 
recipe for making mature beta cells and hepatocytes, and Wil- 
lenbring and others say that those in the field are working as 
a team on this for the first time. A paper published last week
circumvented iPS cells altogether, describing the generation of 
hepatocyte-like cells directly from mouse skin cells, using a cock- 
tail of regulatory proteins important for liver development. But 
Willenbring says he still has questions about whether the cells 
were able to perform all the functions of hepatocytes.
MAINTAINING STANDARDS 

The relative ease of reprogramming has thrown the iPS cell 
field open to almost anyone. But from the start, researchers have 
worried that the low barrier to entry and the extremely competi- 
tive pace have meant that standards are not as rigorous as they 
should be. “There’s a tremendous amount of pressure to get these 
papers out there, and in the rush investigators are not characterizing
their cells very well,” Kriegstein says.

Kriegstein points to a paper published last November by a
team led by Alysson Muotri of the University of California, San 
Diego. The team studied people with a mutated gene that causes 
the neurological condition Rett syndrome as a model for an 
autism-spectrum disorder, which causes behavioural difficul- 
ties. The researchers derived iPS cells from these patients, and 
showed that as the iPS cells differentiated into neurons, they 
initially expressed genes typically found in neural ‘precursor’ 
cells, then later expressed genes involved in neuronal signalling. 
The neurons from the patients with Rett syndrome were smaller 
than those derived from people without the disease, and they 
also had signalling defects and other differences. But Kriegstein 
says that this molecular characterization is not enough to show 
exactly what type of neurons had formed, what part of the central 
nervous system they represented and how they therefore relate 
to processes that go awry in the brain. “Possibly the reported 
abnormalities are relevant to the intended diseases,” Kriegstein 
says, referring to this and other iPS papers, but this would be very 
surprising given that most neurodevelopmental and neurodegen- 
erative diseases affect specific populations of neurons at specific 
times during development, he says. 

Muotri says that his team’s analysis of the neurons was 
thorough, and included electrophysiological tests showing that 
the cells could fire action potentials and were therefore functional. 
He also points out that almost all neurons in the brain express the 
gene, so he did not want to limit the studies just to regions that had 
been associated with Rett syndrome, as insights from the cultured 
neurons could illuminate the molecular mechanisms that disrupt 
the circuits controlling behaviour. “When we find such a cellular 
phenotype in culture, we know we can now start from there to 
understand other layers of complexity,” Muotri says.


MODELLING THE MIND IS HARD 

Investigators are creating patient-specific iPS cells to model 
almost every disease known — but in some cases, researchers 
question how much can be learned from the models. The biggest 
debate is over models of complex neuropsychiatric and behav- 
ioural disorders: can reprogrammed cells really mimic condi- 
tions such as schizophrenia or autism, which affect the brain 
and behaviour in complicated ways? “I’ve had clinicians ask if we 
can make iPS cells from a patient who was mentally retarded,”
says developmental biologist Christine Mummery of the Leiden 
University Medical Center in the Netherlands. But she questions 
how useful that would be. “I said, ‘I don't know, how you would
measure the IQ of a neuron in a dish?’”

Researchers working with these models argue that they are still 
valuable. In April, Fred Gage at the Salk Institute and his team 
reported that they had derived neurons from the skin cells of a
person with schizophrenia, and that some differences between 
those neurons and normal neurons could be corrected by admin- 
istration of the antipsychotic drug loxapine. Like Muotri, Gage 
says that the model is designed to uncover how the genetic fac- 
tors underlying schizophrenia affect the function of neurons. 
“Although we cannot measure the behaviour of the patients, we 
propose that we can measure the activity of the neurons, and the 
goal is to search for cellular and molecular processes that underlie 
the behavioural phenotypes,” he says.

A related problem arises with diseases of ageing, a hot field in 
iPS-cell research: many of the conditions strike mature cells, so 
iPS cells — which are essentially starting their developmental lives 
afresh — might not be relevant. With a disease such as Parkin- 
son's, says Mummery, “this is a real issue — will you be able to get 
neurons mature enough to see anything? People are working very 
hard not only to make their cell type of interest but also to make 
them mature, so that’s still a major technical obstacle.”

Some researchers counter that even ‘young’ cells show traits 
related to diseases of ageing. Renee Reijo Pera at Stanford Univer- 
sity in California made iPS cells from Genia Brin — the mother 
of Google co-founder Sergey Brin. She has Parkinson's, a condi- 
tion that is marked by the destruction of dopamine-producing 
neurons. Once differentiated into neurons, the cells secreted 
dopamine and were more sensitive to chemicals that can induce 
cell death than were dopamine-secreting neurons derived from 
healthy people. “This seems to be the best model of
Parkinson's disease,” Reijo Pera says.

Many iPS-cell researchers see the field’s growing pains as signs 
that it is reaching a state of maturity; they say that the problems 
are no different from those that many biomedical research fields 
face as they inch towards clinical application. “There was this huge 
euphoria in the beginning, with everyone thinking iPS will do 
everything, cure all diseases, and be super-easy,” Meissner says.

“But not everyone can become a stem-cell biologist overnight,”
he says. “It’s a bit of a reality check that things are not as simple
as we thought.”


Erika Check Hayden is a senior reporter for Nature based in San 
Francisco. 
Atomic attacks would cause huge city fires, like this one in San Francisco in 1906, and smoke would cool the planet. 


Nuclear winter is a real 
and present danger 


Models show that even a ‘small’ nuclear war would cause catastrophic climate 
change. Such findings must inform policy, says Alan Robock. 


In the 1980s, discussion and debate about
the possibility of a ‘nuclear winter’ helped
to end the arms race between the United
States and the Soviet Union. As former
Soviet president Mikhail Gorbachev said in
an interview in 2000: “Models made by Russian
and American scientists showed that a
nuclear war would result in a nuclear winter
that would be extremely destructive to all
life on Earth; the knowledge of that was a
great stimulus to us, to people of honour and
morality, to act.”
As a result, the number of nuclear weapons
in the world started to fall, from a peak of
about 70,000 in the 1980s to a total of about
22,000 today. In another five years that number
could go as low as 5,000, thanks to the
New Strategic Arms Reduction Treaty (New
START) between the United States and Russia,
signed on 8 April 2010.

Yet the environmental threat of nuclear 
war has not gone away. The world faces the 
prospect of a smaller, but still catastrophic, 
nuclear conflict. There are now nine 
nuclear-weapons states. Use of a fraction of 
the global nuclear arsenal by anyone, from 
the superpowers to India versus Pakistan, 
still presents the largest potential environ- 
mental danger to the planet by humans. 

That threat is being ignored. One reason for 
this denial is that the prospect of a nuclear war 
is so horrific on so many levels that most peo- 
ple simply look away. Two further reasons are 
myths that persist among the general public: 
that the nuclear winter theory has been dis- 
proved, and that nuclear winter is no longer 
a threat. These myths need to be debunked. 
The term ‘nuclear winter’, coined by Carl
Sagan and his colleagues in a 1983 paper
in Science, describes the dramatic effects 
on the climate caused by smoke from fires 
ignited by nuclear attacks on cities and indus- 
trial areas. In the 1980s my colleagues and I 
calculated, using the best climate models 
available at the time, that if one-third of
the existing arsenal was used, there would
be so much smoke that surface temperatures 
would plummet below freezing around the 
world for months, killing virtually all plants 
and producing worldwide famine. More peo- 
ple could die in China from starvation than 
in the nations actively bombing each other. 
As many countries around the world realized 
that a superpower nuclear war would be a 
disaster for them, they pressured the super- 
powers to end their arms race. Sagan did a 
good job of summarizing the policy impacts
in 1984: although weapons were continuing 
to be built, it would be suicide to use them. 
The idea of climatic catastrophe was 
fought against by those who wanted to keep 
the nuclear-weapon industry alive, or who 
supported the growth of nuclear arsenals 
politically. Scientifically, there was no real
debate about the concept, only about the 
details. In 1986, atmospheric researchers 
Starley Thompson and Stephen Schneider 
wrote a piece in Foreign Affairs appraising 
the theory and highlighting what they saw
as the patchiness of the effect. They coined
the term ‘nuclear autumn’, noting that it
wouldn’t be ‘winter’ everywhere in the aftermath
of a nuclear attack. They didn’t mean
for people to think that it would be all raking 
leaves and football games, but many mem- 
bers of the public, and some pro-nuclear 
advocates, preferred to take it that way. The 
fight over the details of the modelling caused 
a rift between Sagan and Schneider that 
never healed. When I bring up the topic of
nuclear winter, people invariably tell me that
they think the theory has been disproved. 
But research continues to support the 
original concept. By 2007, models had begun
to approximate a realistic atmosphere up to 
80 kilometres above Earth’s surface, includ- 
ing the stratosphere and mesosphere. This 
enabled me, and my coauthors, to calculate 
for the first time that smoke particles would 
be heated by the Sun and lifted into the 
upper stratosphere, where they would stay 
for many years. So the cooling would last
for much longer than we originally thought. 


DARK DAYS 

Many of those who do accept the nuclear- 
winter concept think that the scenario 
applies only to a mass conflict, on a scale 
no longer conceivable in the modern world. 
This is also false. A ‘small’ nuclear war 
between India and Pakistan, with each using 
50 Hiroshima-size bombs (far less than 
1% of the current arsenal), if dropped on 
megacity targets in each country would 
produce climate change unprecedented 
in recorded human history. Five million
tonnes of black carbon smoke would be 
emitted into the upper troposphere from the 
burning cities, and then be lofted into the 
stratosphere by the heat of the Sun. Temper- 
atures would be lower than during the ‘Little 
Ice Age’ (1400-1850), during which famine 
killed millions. For several years, growing 
seasons would be shortened by weeks in the 
mid-latitudes (see ‘A decade of cooling’).


A DECADE OF COOLING

The detonation of 100 nuclear bombs could cause fires releasing 5 million tonnes of black carbon, with
long-term temperature effects much greater than those from the 1991 eruption of Mount Pinatubo.

[Figure, two panels. GLOBAL TEMPERATURE WOULD DROP AFTER A NUCLEAR EVENT: variance (°C) from the 1951-80 mean, plotted for 1880 to 2020, showing climate variance since 1880 and the effect of 5 megatonnes of black carbon. CHANGE IN GLOBAL SURFACE SOLAR RADIATION: watts per square metre against time from event (years), comparing the effect of the Mt Pinatubo eruption with the effect of 5 megatonnes of smoke.]


276 | NATURE | VOL 473 | 19 MAY 2011 


© 2011 Macmillan Publishers Limited. All rights reserved 


Brian Toon at the University of Colorado 
in Boulder, Richard Turco at the Univer- 
sity of California, Los Angeles, Georgiy 
Stenchikov at Rutgers University in New 
Brunswick, New Jersey, and I, all of whom 
were pioneers in nuclear-winter research 
in the 1980s, have tried, along with our 
students, to publicize our results. We have 
published refereed journal articles, popu- 
lar pieces in Physics Today and Scientific 
American, a policy forum in Science, and 
now this article. But Foreign Affairs and
Foreign Policy, perhaps the two most prominent
foreign-policy magazines in English,
would not even review articles we submitted.

We have had no luck getting attention from
the US government. Toon and I visited the
US Congress and gave briefings to congressional
staff on the subject two years ago, but nothing happened as
a result. The US President’s science adviser 
John Holdren has not responded to our 
requests — in 2009 and more recently — 
for consideration of new scientific results 
in US nuclear policy. 

The only interest at a national level I have 
had was somewhat surreal: in September 
2010, Fidel Castro summoned me to a con- 
ference on nuclear winter in Havana, to help 
promote his new view that a nuclear conflict 
would bring about Armageddon. The next 
day, my talk — the entire 90 minutes includ- 
ing questions — was broadcast on nation- 
wide television in prime time, and appeared 
on the front page of the two national news- 
papers in Cuba. 

As in the 1980s, it is still too difficult for 
most people to fully grasp the consequences 
of a nuclear conflict. But it must be grasped.
We scientists must continue to push our 
results out to the public and to policy- 
makers, so they can in turn push political 
will in the direction of disarmament. Just as 
Gorbachev, armed with the knowledge of 
nuclear winter, helped to end the cold war, 
so too can the politicians of today use science 
to support further reductions in arms. The 
New START treaty is not enough.


Alan Robock is in the Department of 
Environmental Sciences, Rutgers University, 
New Brunswick, New Jersey 08901, USA. 
e-mail: robock@envsci.rutgers.edu 
ROCHE 


Medical chemists at pharmaceutical giant Roche want to build better drug—disease models. 


Research beyond 
the recession 


Nokia, Toyota and Roche explain how they are 
weathering the financial crash, the technologies they 
are investing in and their innovation strategies. 


NOKIA 
Hard times are 
opportunities 


Leo Kärkkäinen, distinguished
scientist, Nokia Research Center, 
Helsinki, Finland 


Different parts of the research and develop- 
ment (R&D) chain respond to hard times in 
different ways. Agility, out of the box think- 
ing and solid R&D performance are even 
more highly valued now than in times of 
plenty. However, in a deep recession, no one 
is insulated from fiscal pressures. The first 
parts of the R&D chain, open innovation and 
long-term research, are asked to ‘fail fast’,
that is, to figure out the future relevance of
new technologies as soon as possible. For 


the development part, it is necessary to have 
solid product delivery and agility in respond- 
ing to fast-moving consumer trends to steer 
the product towards markets that are not as 
sensitive to recession pressures. 

Yet hard times are often the best times to 
invest and create new products, because they 
are precisely when the competition is less 
likely to be able to respond to the challenge. 
So cutting R&D costs under fiscal pressure 
is sometimes, paradoxically, the most expen- 
sive thing to do. 

In the next 20 years, we will see the rise 
of carbon-based electronics, using ultrathin 
graphene sheets, which will change how 
electronic components are produced and 
integrated onto chips or devices. Graphene 
is thin and transparent,
often created by peeling
away an atomic layer of
carbon from a growing
substrate like copper.
This will make it naturally flexible, enabling
a new breed of electronic devices, embedded
in everyday objects, and wearable like cloth.
These flexible electronic devices will be able
to talk to each other to provide a better user
experience, from faster response times to
context-sensitive behaviour.

NATURE.COM
Pfizer shuts UK research site: go.nature.com/x2ghpn

At the same time, manufacturing 
methods will change dramatically, driving 
down the cost of the products — we will 
see more roll-to-roll printed electronics, 
as already used for flexible displays. This 
could increase technology adoption rates 
in poorer nations, affecting global social 
development. By 2020 we can also expect 
widespread applications of artificial intel- 
ligence, self-driving cars and robots that 
interact with humans more easily. 

Nokia is driven by a passion for doing 
things, and a positive company culture is 
very important to us. We have companywide 
activities to encourage diversity in innova- 
tion. In our yearly Nokia Excellence Award, 
hundreds of well-performing and successful
innovative projects are reviewed and the best
dozen are personally evaluated by the Nokia
chief executive and board members in a
face-to-face event with the inventors.

One of the biggest challenges in innova- 
tion is prioritization: new ideas may not have 
a convincing enough business case or value 
until changes in the business environment 
make them obvious to everyone (including 
competitors). This can, of course, happen 
when it is too late to reap any rewards. One 
way we tackle this is to combine prioritiza- 
tion with crowd-sourcing of new ideas from 
our employees. In our Nokia Sphere project, 
employees can vote to work and improve on 
ideas that seem promising, driving imple- 
mentation from the bottom up. 


TOYOTA 
Improve efficiency 
of development 


Toyota Motor Corporation, Aichi, 
Japan 


We have been working in a very difficult busi- 
ness environment since the recession began 
in 2009 (see ‘Corporate changes’). But we 
are still investing more than 700 billion yen 
(US$8.7 billion) a year in R&D. We want to 
keep our competitive edge in technology and 
products, so we are maintaining high levels of 
investment to develop advanced technology 
related to safety, the environment and energy. 
We've used the recession as an opportunity 
to work with our suppliers to improve our 
development efficiency and we hope to take 
advantage of this in the future. 

At the moment, there is no clear
alternative to petrol as an energy source
for cars. So we are developing a wide range 
of products based on hybrid-vehicle tech- 
nology, combining an electric motor and a 
petrol engine. Our approach is to develop 
the best cars for the consumer in each dif- 
ferent market. 

We currently have a strong focus on bat- 
teries for future electric vehicles. Although 
lithium-ion batteries are becoming more 
widely used, it is hard to see electric vehicles 
completely replacing conventional passen- 
ger cars, even if we push the performance of 
lithium-ion batteries to the limits. We have 
to solve problems of energy storage density 
and cost. We are researching and developing 
all-solid and metal-air batteries, which are
two promising alternatives to lithium-ion. 

Another possible game-changing technol- 
ogy is solar power. More and more house- 
holds are using solar cells. At the moment, 
some of our hybrid Prius cars have solar-pow- 
ered ventilation systems that operate while 
the car is parked, but it may also be possible 
to use solar power to drive the vehicle if we 
can achieve a breakthrough in the efficiency 
of generating electricity from solar energy. 

In the long term, we believe that the use 
of vehicle telematics will revolutionize the 
car industry. We are seeing rapid develop- 
ment and innovation in automated driving 
and accident prevention. As vehicle-control 
technology advances, more cars may be able 
to avoid collisions. Then it may become 
possible to change vehicle structures and 
make cars much lighter. That will in itself 
reduce energy usage. 

The Japanese idea of monozukuri, which
could be translated as ‘making things’, is at
the heart of Toyota's approach. We think that
new ideas are created by digging into the root 
causes of problems and by finding out facts 
through genchi genbutsu, which means actu- 
ally going to a site and discovering the real 
situation for yourself. It is important that we 
nurture our employees to take this practice to 
heart. For the past 50 years, this approach has 
been the driving force behind the innovation 
and originality in our development processes. 


ROCHE 
Collaborate with 
the public sector 


Jean Jacques Garaud, global head 
of pharma research and early 
development, Roche Holding, Basel, 
Switzerland 


The recession is diminishing the funding 
available for research at publicly funded sci- 
entific institutions. This compels them to be 
more open to, and more collaborative in, pub- 
lic-private partnerships. Since the integra- 
tion into Roche of Genentech, a Californian 
biotechnology company, in 2009, Roche has 
operated two autonomous Research and Early 
Development units, pRED and gRED, with 
distinctive approaches. In the first 18 months 
of pRED, we've developed and driven exter- 
nal collaborations, ranging from relationships 
with individual academics to entire networks 
with leading academic and health institutions. 

CORPORATE CHANGES

In 2009, corporate research and development (R&D) spending declined for the first year in more than a
decade (see graph), according to a study of 1,000 of the world’s most research-intensive companies by
New York analysts Booz & Company.

Total R&D spending in 2009 dropped
3.5%, but revenues fell more sharply, by
11%. So R&D is still one of the last places
that corporations make cuts. About half of
the 1,000 firms cut their R&D portfolio in
2009, but nearly all the cuts came in three
industries: car manufacturers, computing
and electronics.

R&D AND SALES

Research spending in the health-care sector grew by a
modest 1.5% in 2009, as reflected in the rankings of the
top spenders (see table). Toyota Motor Corporation and
Nokia both dropped, while Roche Holding climbed two

At the same time the economic crisis
increases the pressure on drug prices and
forces us to home in on drug candidates
that will add value from a medical and 
public-health standpoint. We are focusing 
efforts on personalized health care, because 
patients with the same condition can react 
to the same treatment in different ways — 
and sometimes even receive treatment that 
is inappropriate for them. To better fit the 
treatment to the patient, we must concen- 
trate on better understanding the molecular 
basis of diseases and their heterogeneity. 

I'm optimistic that these recessionary 
challenges can be turned into opportunities 
to make health care better, safer and more 
effective. 

Our ultimate goal is to understand the 
biology of diseases and translate this knowl- 
edge into the clinic. New technologies that 
will help include cell-penetrating peptides 
that may allow the delivery of drugs into cells 
as well as therapeutic interactions on the cell 
surface. For peptides in general, we will need 
to develop synthesis methods to overcome 
difficulties, such as structural instability, that 
can weaken peptide interaction with targets 
and reduce activity and specificity. 

Stem cells will also be increasingly impor- 
tant as translational-research tools. With 
differentiated cells derived from stem cells, 
we are able to study the effects of drug com- 
pounds on clinically relevant targets and 
observe cellular functions at an early stage. 

Finally, computer modelling and simula- 
tion could also be game changers, if we can 
build more reliable drug—disease models to 
better design experiments and predict their 
outcome. 

To encourage such innovation, Roche 
fosters an environment that allows our scien- 
tists to grow and experiment with new ideas 
and approaches. One way to do that is to talk 
about science itself, not just about managing 
science. We have launched a ‘barn initiative’ to 
provide informal environments for kindling 
creativity in settings from campuses and castles
to converted barns. At these ‘barns’, away
from their day-to-day projects, scientists can 
engage in positive and challenging scientific 
discussions on a specific theme. 

It is also important to provide the recognition
and the rewards that scientists deserve.
Our publication strategy explicitly encour- 
ages publishing in scientific journals and we 
advocate the exchange of ideas at scientific 
conferences.


CORRECTION 

In the Comment article ‘The art of 
conservation’ (Nature 472, 287-289; 
2011), the 1964 Durrell Wildlife 
Conservation Trust logo and the 1961 
Friends of the Earth International logo 
were actually from 1999 and the 1970s, 
respectively. 
Better understanding of decision-making processes in the brain might predict which perpetrators will offend again. 


My brain made me do it 


Adam Kepecs urges caution in considering the unconscious mind in the justice system. 


A surprising view has been gathering
momentum in neuroscience: most of
our thoughts and actions are driven
by unconscious brain processes that are 
hidden from conscious introspection. So if 
consciousness is rarely in the driver’s seat, 
and if we cannot choose our genes or the 
childhood experiences whose interactions 
form our brains, then are we responsible for 
our actions? 

In Incognito, accomplished neurosci- 
entist David Eagleman — author of the 
best-selling short-story collection Sum 
(Canongate, 2010) — examines this gap 
between our conscious and unconscious 
selves. He offers a whirlwind of stories, from 
visual illusions and sleep-walking killers to 
ovulating strippers, all carefully chosen to 
drive home his main point that our brains 
“neurally preordain” us to make decisions. 
As is common in books aimed at a general 
readership, the intriguing and sometimes 
bizarre case studies create a tension between 
journalistic musings and more detailed 
arguments. Although specialists may feel 
that the balance tilts toward the journalistic, 
Eagleman’s expertise comes through. 

Since Sigmund Freud’s famous psychological
framing of the unconscious in the
late nineteenth century, modern neuroscience
has shown that most processing in the
brain is unconscious. We are unaware of
routine processes and have little insight into
our choices and preferences. For instance,
men unknowingly prefer photographs of
women with dilated pupils, presumably
because they recognize pupil dilation as an
indicator of sexual arousal. In another experiment,
people’s descriptions of the strategies they
used to make simple economic decisions
differed from the rules that they actually
used, suggesting that their conscious
explanations were formed post hoc and
without access to their decision-making
process. Through such examples, Eagleman
demonstrates that unconscious processes
can be clever and adaptive, and can even
outperform the best computer algorithms.

2011. 304 pp/272 pp. $26.95/£20

© 2011 Macmillan Publishers Limited. All rights reserved

If our brains can carry out such amazing
feats without us knowing, why have
consciousness at all? Eagleman answers this
question with a metaphor. Consciousness,
he says, is like the chief executive of a large
company. He or she has little knowledge of
the day-to-day operations, yet is indispensable
for setting goals and arbitrating between
conflicting departments. Similarly,
consciousness gets only the abridged,
delayed and sometimes contradictory
reports from neural subroutines. And,
much like a chief executive trying to explain
him- or herself to the board of directors,
consciousness will “fabricate stories to
explain the sometimes inexplicable dynamics
of subsystems in the brain”.

See Nature's special issue on Science in Court:

FROM THE 1949 FILM A RUN FOR YOUR MONEY. COURTESY RONALD GRANT ARCHIVE

Having described the hidden life of our 
brain circuits, Eagleman moves to an original 
and provocative discussion of the legal con- 
sequences of the unconscious decider within 
us. Imagine two defendants on trial for mur- 
der: one has a large brain tumour next to an 
area associated with aggression, whereas the 
other one shows no obvious change in his 
brain. Most people would not hold the first 
defendant responsible for his actions. Eagle- 
man argues that as we gain a better under- 
standing of the biology of decision-making, 
we will be forced to conclude that all crime is 
caused by faulty brain circuits arising from 
genetic and environmental interactions over 
which the perpetrator has no control. 

An improved understanding of how 
subtle changes in the brain generate devi- 
ant behaviour would therefore extend the 
insanity defence — ‘my brain made me do 
it’. Eagleman suggests that a forward-looking
legal system should consider biological infor- 
mation to predict how likely a person is to 
commit a crime again, and take this into 
account for sentencing. As most criminals 
commit offences because they are unable to 
inhibit their impulses, Eagleman proposes 
that rehabilitative “prefrontal workouts”,
aimed at improving self-control, should be a 
mainstay of the justice system. Crime would 
still land you in jail, but the focus would be on 
protecting society, not on punishment. 

My feeling is that we need to be extremely 
cautious in advancing such a brain-centric 
legal system. A world in which judges are 
instructed to consider the genetics and 
neural make-up of defendants, as Eagleman 
advocates, evokes Philip K. Dick’s short
story The Minority Report. If sentencing 
decisions consider the biological likelihood 
of recommitting a crime, it is easy to imag- 
ine the next step of considering preventive 
measures before a crime has been committed,
a kind of ‘Department of Precrime’.

Whether or not one agrees with Eagleman, 
discussions about these difficult issues at the 
intersection of neuroscience and society are 
essential and timely. He should be lauded for 
his clear exposition of the consequences of 
our emerging understanding of the brain. 
Incognito is a smart, captivating book that 
will give you a prefrontal workout.


Adam Kepecs is assistant professor of 
neuroscience at Cold Spring Harbor 
Laboratory, 1 Bungtown Road, Cold Spring 
Harbor, New York 11724, USA. 

e-mail: kepecs@cshl.edu 


Books in brief 


House on Fire: The Fight to Eradicate Smallpox 
William H. Foege UNIVERSITY OF CALIFORNIA PRESS 240 pp. 
$29.95 (2011) 
Adding to the series of California/Milbank Books on Health and 
the Public, this part-memoir, part-history by epidemiologist 
William Foege recounts his involvement in the global vaccination 
programmes that eradicated smallpox in the 1960s and 1970s. 
Foege, now a senior fellow at the Bill & Melinda Gates Foundation 
in Seattle, Washington, reflects on the strategies that led to wide 
uptake of the vaccines across Africa and India and discusses their 
successes and vulnerabilities. 




A Perfect Moral Storm: The Ethical Tragedy of Climate Change 
Stephen M. Gardiner OXFORD UNIVERSITY PRESS 512 pp. 

£22.50 (2011) 

Inaction on climate change is more than a political or explanatory 
bungle — it is a moral failure, declares philosopher Stephen 
Gardiner. He identifies three reasons for this: the global imbalance of 
power, such that rich nations dominate poor ones; intergenerational 
factors, such that present generations dictate the world that future 
ones will inhabit; and our inability to make predictions using current 
scientific knowledge. A ‘perfect storm’ of these three factors creates 
an ethical headache, Gardiner contends. 


Redesigning Leadership: Design, Technology, Business, Life 
John Maeda with Becky Bermont MIT PRESS 104 pp. $18 (2011)
Celebrated designer and computer scientist John Maeda shares 
his thoughts on leadership in this concise volume. Interspersing 
his musings with philosophical tweets, he reflects on how he has 
sought out imaginative ways to run teams and organizations, from 
the Media Lab at the Massachusetts Institute of Technology in 
Cambridge to the Rhode Island School of Design in Providence. He 
describes how to make meetings run faster and be more fun, the 
team-building benefits of free food and how to harness conflicting 
opinions within a group of creatives. 


An Empire of Ice: Scott, Shackleton, and the Heroic Age of 
Antarctic Science 

Edward J. Larson YALE UNIVERSITY PRESS 360 pp. £18.99 (2011) 
Looking at the broader context of the race to be first to reach the 
South Pole in the early twentieth century, historian Edward Larson 
celebrates the centenary of these explorers’ achievements. It was 
the greater scientific ambition of the British Antarctic expeditions, 
he argues, that caused Robert Scott and Ernest Shackleton to be 
slower than their more narrowly focused Norwegian rival, Roald 
Amundsen, who made it to the South Pole 35 days before Scott 
and his ill-fated team. 


Engineering Animals: How Life Works 

Mark Denny and Alan McFadzean BELKNAP PRESS/HARVARD UNIVERSITY
PRESS 400 pp. $35 (2011)

From soaring albatrosses to croaking bullfrogs, different creatures 
exploit various aspects of engineering to help them fly, hunt or 
communicate. In a clear and well-illustrated account, former 
aerospace engineers Mark Denny and Alan McFadzean describe 
the principles of physics that underlie animals’ sense of smell, 
their use of sonar, and how they flock, signal to each other and 
consume energy.

Sergei Korolyov (played by Darrell D’Silva) was celebrated after the success of the Sputnik satellite.


The chief designer 


A clever play shows how engineer Sergei Korolyov 
drove the Soviet space programme, finds Philip Ball. 


former Soviet military–industrial complex.
Fifty years ago, the cosmonaut Yuri
Gagarin became the first person in space, 
orbiting the world for 108 minutes in the 
Vostok 1 spacecraft. And 25 years ago, reac- 
tor four at the Chernobyl nuclear plant in 
Ukraine exploded, sending a cloud of radio- 
active debris across northern Europe. 

One triumph, one failure. Rona Munro's 
play Little Eagles, commissioned by the 
Royal Shakespeare Company for the Gaga- 
rin anniversary, understandably makes no 
mention of the later disaster, but connections 
assert themselves throughout. 

Both events were fruits of the cold war 
nuclear age. The rockets made by Sergei 
Korolyov — chief architect of the Soviet space 
programme and the play’s central character 
— armed President Nikita Khrushchev with 
intercontinental ballistic missiles before 
they took Gagarin to the stars. But the space 
programme degenerated along the same 
bureaucratic lines that 
made an exclusion
both technologies
fatally — most notably 
in Little Eagles in the crash of Soyuz 1 in 1967, 
which killed cosmonaut Vladimir Komarov. 
Gagarin was listed as the backup pilot for that 
mission, but was by then too valuable a trophy 
to be risked in another space flight. All the 
same, Gagarin died a year later, during the 
routine training flight of a jet fighter.

Callous disregard for life marks Munro's 
play throughout. We first see Korolyov, a 
military rocket engineer, in the Siberian 
labour camp where he was sent during 
Stalin's purge of the officer class just before 
the Second World War. As the Soviets devel- 
oped their rocket programme, the stupidity 
of sending someone so brilliant to a virtual 
death sentence dawned on the regime, and 
Korolyov was freed. During the 1950s, he 
gained control of the whole space enterprise, 
becoming known as the Chief Designer. 

Munro’s portrayal of Korolyov seems
accurate, if the testimony of one of his chief 
scientists is anything to go by: “He was a king,
a strong-willed purposeful person who knew
exactly what he wanted ... he swore at you,
but he never insulted you. The truth is, everybody
loved him.” Portrayed in the Hampstead
Theatre production by Darrell D’Silva,
he is a swaggering, cunning, charming force 
of nature, playing the system so that he might 
realize his dream of reaching for the stars. He 
reciprocates the love of his ‘little eagles’, the
cosmonauts chosen with an eye on the Vostok 
capsule’s height restrictions. (Even today the 
Soyuz capsule excludes tall cosmonauts.) 

But for the leaders of the Soviet Union, 
rocketry was merely weaponry, a way of 
demonstrating superiority over their foes in 
the West. Korolyov becomes a hero for beat- 
ing the Americans with Sputnik, and then 
for Vostok 1. But when the thuggish, foul- 
mouthed Khrushchev (played by a terrifying 
Brian Doherty) is retired in 1964 in favour of 
Leonid Brezhnev, the game changes. Brezh- 
nev sees no virtue in Korolyov’s dream of a 
Mars mission, and worries instead that the 
Americans will beat them to the Moon. The 
rushed and bungled Soyuz 1, launched after 
Korolyov’s death in 1966, was the result. 

Out of this fascinating but chewy material, 
Munro has woven a moving and often beau- 
tiful tale. Gagarin’s own story is a subplot. 
Grounded as a toy of the Politburo, we see 
his sad descent into the vodka bottle but not 
his ignominious end — that is too much to 
be shoehorned into this packed play. Never- 
theless, it is a satisfying and wise production. 

The play might be regarded as a foil to The 
Right Stuff, Tom Wolfe’s 1979 account of the 
US space programme, which was made into 
an exhilarating film in 1983. Wolfe’s celebra- 
tion was a fitting tribute to the courage and 
ingenuity that took humans to the Moon, but 
a glimpse at the other side of the coin was 
long overdue. There is something awesome 
as well as awful in the grinding resolve of the 
Soviets to win the space race relying on just 
the chief engineer, “convicts and some university
students”, as Munro puts it.

Little Eagles shows us the mix of noble and 
ignoble impulses in the space race that the 
US programme, with its Columbus rheto- 
ric, still cannot afford to acknowledge. The 
play recognizes the glory of seeing the stars 
and Earth from beyond the atmosphere, but 
reveals the human space-flight programmes 
as a product of their tense times, as national- 
istic black holes for dollars and roubles (and 
now yuan too). And, like Chernobyl, such 
politically motivated displays are open to 
mistakes not from excessive ambition but 
from fear of failure. 

Crucially, Munro leaves the final judge- 
ment to the audience. “They say you 
changed the whole sky and everything under 
it,” Korolyov’s physician says to him in the 
final lines. “What does that mean?”


Philip Ball is a writer based in London. 


D. COOPER/PHOTOSTAGE 


COURTESY OF CHARLES JENCKS 


Cosmology is a strong theme within the garden of landscape designer Charles Jencks. 


Q&A Charles Jencks 
Cosmic gardener 


Charles Jencks designs landscapes and sculptures to convey concepts in astronomy, biology and 
mathematics — notably at CERN, Europe’s particle-physics lab near Geneva, Switzerland, and 
in his Garden of Cosmic Speculation near Dumfries in Scotland, UK. On the launch of his new 
book, he discusses green architecture and metaphor. 


Why use landforms and landscapes to 
express scientific ideas? 

My book The Universe in the Landscape 
describes my designs at all scales, from 
small gardens to a restored open-cast coal 
mine. They mix different media — archi- 
tecture, sculpture, planting and epigraphy 
— to interpret basic ideas of the cosmos. I 
see turf mounds as a medium through which 
we can interpret a larger cosmic nature. 
This endeavour parallels those in prehis- 
tory, when people made landforms such as 
the stone circles of Brodgar in the Orkney 
Islands and Stonehenge near Salisbury, UK. 


How did you get involved with CERN? 

CERN’s director-general, Rolf Heuer, and
his team asked me to collaborate on a project
at the centre of the Large Hadron Collider.
They had built a wooden dome, which
they call the Globe of Science and Innovation,
and I argued that they should make a
landscape around it. I talked to many
scientists, including the astronomers Martin
Rees and Bernard Carr, who suggested
using uroboros, the alchemical symbol of
a snake eating its tail. One goal of science is
to reconcile relativity and quantum theory.
No one knows which will swallow which: will
the large explain the small, the small explain
the large, or neither?

The Universe in the Landscape: Landforms by Charles Jencks
CHARLES JENCKS
Frances Lincoln: 2011, 256 pp. £40


What will the CERN landscape look like? 

As an overall strategy, a connected landscape 
will protect what little green land remains 
around the Globe. I have proposed a design 
in the shape of a modified uroboros — a ring 
that connects steps in the relative scale of 
objects in the Universe, with a question mark 
for the snake’s head. CERN’s particle collisions
have become a sculptural icon used for
architectural details: in the form of an eye,
with rays exploding outwards from its central 
point. The construction could be finished in 
three years. 


How does your Garden of Cosmic 
Speculation explore cosmology? 

There is a ‘Universe cascade’ where water 
runs down and time runs up, in a series 
of stone steps that show the history of the 
Universe. As you walk up the steps, jump- 
ing from platform to platform, you see the 
Universe slowly unfolding from its origins 
in the mists of the quantum soup. Mathe- 
matical physicist Roger Penrose asked me, 
“How could you build superstrings in con- 
crete when they are so clearly going to be 
proven wrong in ten years?” But you enjoy 
the play of uncertainty in a garden. You can 
revise your work when you make mistakes. 
Besides, they date the design precisely at the 
point we believed these theories. 


What was your intention in building it? 

The garden is a project started in 1988 with 
my late wife, Maggie. We tried to translate 
some of the metaphors of science into landscape
design. There is a DNA garden with
six cells whose walls are made up of vari- 
ous plants. At the centre of each cell is the 
nucleus, and at the centre of the nucleus are 
six versions of the double helix, unfolding 
with RNA coming out to be read by plants 
that represent ribosomes. It is a critical look 
at the relationship between DNA and the cell. 


How is science changing architecture? 
Recent attempts to pull together complexity 
theory and architecture have led to an enor- 
mous amount of design, but not much of it 
has been built or is convincing. However, 
computers are transforming architecture. 
Architect Frank Gehry uses software writ- 
ten by French aerospace engineers to design 
curved buildings. And Zaha Hadid and her 
students in London are putting forward 
‘parametric architecture’, which uses rules to
generate appropriate designs. 


How do you see the role of architects in 
future? 

Every time an architect designs a building, 
they are predicting what will be relevant for 
the world in the coming decades. They must 
also persuade clients and society. Foster and 
Partners, one of the greenest architecture 
firms on the planet, got the Reichstag in Ber- 
lin running on vegetable oil. But they also 
build expensive skyscrapers. The problem for 
the profession is that architects do not con- 
trol enough of the building process to lead the 
green agenda. But they should try. You fight 
the right battles even if you do not expect to 
win the war.


INTERVIEW BY JASCHA HOFFMAN

Guidelines for HIV
in court cases 


In many nations it is a crime to 
infect someone with HIV by 
intention or non-disclosure. 

As phylogenetic experts who 
advise courts worldwide, we are 
calling for guidelines on how 
phylogenetics should be used 
in criminal HIV investigations. 
The inappropriate use of 

such evidence in suspected 
transmission cases can have dire 
legal and social ramifications. 

The scientist's job is not to 
argue for or against a defendant's 
guilt: that is a task for lawyers. 
Phylogenetic investigators 
should limit themselves to 
an expert opinion on what 
information about viral 
transmission can be deduced 
from their analysis. This must be 
derived impartially, for example 
by blinding the identities of case 
subjects. 

Scientists must explain to 
courts that phylogenetic analysis 
cannot ‘prove’ any particular 
hypothesis, such as ‘person 
A infected person B’. Rather,
results may be compatible with 
several hypotheses, or support 
one over another. 

An a priori hypothesis should
be formulated by different 
independent epidemiological 
experts, based on contact 
possibilities between the 
purported victim(s) and the 
defendant, and on any additional 
contacts or risk factors. 

Phylogenetic analysis alone 
cannot exclude the possibility 
that HIV was transmitted from 
A to B through unsampled 
persons. Although the direction 
of viral transmission can 
sometimes be supported, it does 
not prove direct transmission. 
Thomas Leitner on behalf of 
8 co-authors*, Los Alamos National 
Laboratory, New Mexico, USA. 
tkl@lanl.gov 
*A full list of co-authors is 
available online at 
http://dx.doi.org/10.1038/473284a.

PhDs: what’s left if
science abdicates? 


I disagree that we have a glut of 
scientists with PhDs 
(www.nature.com/phdfuture). 
The corporate view of PhD 
numbers in terms of what the 
market will bear ignores the 
major problems that only science 
can solve in the coming century. 
The list is long: natural 
disasters, such as earthquakes 
and incoming celestial objects; 
environmental degradation; 
sustainable energy; famine and 
violence; untreatable medical 
conditions; and threats such as 
antibiotic resistance. If science 
abdicates, there is nothing else. 
The urgency of these problems 
requires a large cadre of trained 
individuals to be enlisted to 
defend our planet. The size of 
the military is dictated by our 
defence needs, not the market. 
In science, by analogy, our global 
defence needs are soaring. 
Spending a few years in the 
service of science and the greater 
good, being rewarded with 
an advanced degree and, for 
example, going on to teach in high 
schools is an honourable fix. 
Kenneth S. Kosik University of 
California, Santa Barbara, USA. 
kosik@lifesci.ucsb.edu 


> NATURE.COM 
Join the debate on the future of the PhD: 
go.nature.com/phdfuture

PhDs: Israel also
trains plenty 


You contend that few PhDs 
are trained in the Middle East 
outside Egypt (Nature 472, 
276-279; 2011). Israel is a 
sizeable contributor as well. 

In 2008-09, Israel had more 
than 10,000 students enrolled in 
doctoral programmes (Central 
Bureau of Statistics, Israel). This 
is fewer than Egypt's 35,000 for 
the same period, but many more 
per capita. 

Given ongoing tensions in 
the region, the scientific press 
has a responsibility to report 
data related to higher education 
and research transparently and 
accurately (see also Nature 471, 
37; 2011). 

Thomas Hays Mount Sinai School 
of Medicine, New York, USA. 
thomas.hays@mssm.edu


Crop failure signals 
biodiversity crisis 


Crop failures have pushed up 
food prices globally (Nature 472, 
169; 2011). Human well-being 
depends on biodiversity and 
natural habitats as a source of 
food. Ironically, the countries 
harbouring these vital natural 
assets are also those currently 
facing the most severe food 
crises. 

A report from the investment 
bank Nomura (go.nature.com/ 
pwrlc9) introduces a global 
index for measuring nations’ 
food vulnerability. The most 
vulnerable depend totally on 
imported food, and citizens 
spend more than one-third of 
their salaries on it. 

Of the 35 most vulnerable 
countries, 15 contain tropical 
biodiversity hotspots. To produce 
more food, these countries may 
lease out their biodiversity-rich 
land to farm cash crops. Liberia, 
for example, intends to add 
220,000 hectares of oil-palm
plantation (go.nature.com/
xblcjz) to its existing 1.6 million 
hectares of agricultural land in 
the southeast, one of the last 
strongholds of tropical forest in 
western Africa. 

Vulnerable nations need 
better cooperation among 
governments to address the 
structural causes of imbalances 
in the international agricultural 
system; more research into new 
technologies that incorporate 
the food-production 
requirements of the rural 
poor; and stronger protection 
of natural systems by linking 
biodiversity preservation to 
increased food security. 

Kelvin S.-H. Peh University of 
Cambridge, UK. 
kelvin.peh@gmail.com 


China must reduce 
fertilizer use too 


Environmental damage caused 
by reactive nitrogen is not just a 
European problem (Nature 472, 
159-161; 2011). China must also 
rein in its overuse of nitrogen 
fertilizers — which accounts 
for 40% of global production 
since 2006 — to balance food- 
security requirements with the 
protection of human health and 
the environment. 

Despite China's nitrogen 
consumption almost doubling 
between 1990 and 2009, its 
grain production increased 
by just 22%. Although the 
research community widely 
recognizes the problem of 
fertilizer overuse, farmers in 
China continue the practice, 
which is promoted by some 
agricultural-extension advisers 
and by sellers of fertilizer. 

Chinese farmers need to 
be taught how, when and 
in what quantities fertilizer 
should be applied. The existing 
agricultural-extension system 
must revert to its role of assisting 
farmers by methods other 
than promoting fertilizer sales. 


OLIVER MUNDAY 


Establishing an environmental- 
extension system at the township 
level could also help to prevent 
overuse of fertilizers and 
pesticides. 

Peng Gong, Lu Liang and 
Qiang Zhang Institute for 

Global Change Studies, Tsinghua 
University, China. 
penggong@tsinghua.edu.cn 


Data archiving is a
good investment 


We have found that ongoing 
financial investment in data- 
archiving infrastructure yields an 
impressive scientific return, and 
believe that it should be whole- 
heartedly supported by research 
funding agencies (see, for 
example, go.nature.com/nzftf3). 

We used Dryad (see http:// 
datadryad.org), an international, 
open, cost-effective data 
repository for the biological 
sciences, to estimate the cost 
of archiving data from more than 
10,000 publications. We found 
that these could be curated and 
the data preserved at an annual 
cost of about US$400,000. 

As an example of how much 
research is typically published 
per grant dollar, core grants in 
population and community 
ecology from the US National 
Science Foundation averaged 
3-4 publications per $100,000 
of grant between 2000 and 
2005 (S. Reyes, A. Tessier and 
S. Mazer, unpublished results). 
That is, $400,000 invested in 
original research resulted in 
about 16 papers. 

Dryad cannot yet tell us how 
effective data archives are in 
facilitating primary research 
publications, but the Gene 
Expression Omnibus (GEO) 
database at the US National 
Center for Biotechnology 
Information offers some insight. 
To estimate data reuse, we 
searched the full text of articles 
in PubMed Central for mention 
of any of the 2,711 data sets 
deposited in GEO in 2007. 

We excluded articles whose 
authors’ names overlapped 
with those depositing the data 
set. Extrapolating the 338 hits 
in PubMed Central to all of 
PubMed, we estimate that the
GEO 2007 data sets made
third-party contributions to more
than 1,150 published articles
by the end of 2010, and reuse
continues to accumulate rapidly
(H. A. Piwowar, T. J. Vision and
M. C. Whitlock Dryad Digital
Repository doi:10.5061/dryad.jlfd7; 2011).

Assuming that Dryad has a 
comparable rate of reuse and 
collects at least 2,500 data sets 
annually, an investment of 
$400,000 in one year should 
contribute to more than 1,000 
papers in the next four years — 
far more than the accepted value 
for a research dollar. 
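
The comparison in this letter can be checked with a few lines of arithmetic. The sketch below is only an illustration: the variable and function names are invented here, and it assumes, as the letter does, that Dryad data sets would be reused at the same per-data-set rate as the GEO 2007 deposits.

```python
# Figures quoted in the letter (lower bounds where the letter says "more than").
GEO_DATASETS_2007 = 2711        # data sets deposited in GEO in 2007
GEO_REUSE_ARTICLES = 1150       # third-party articles estimated by end of 2010
DRYAD_DATASETS_PER_YEAR = 2500  # assumed annual Dryad intake
ANNUAL_ARCHIVE_COST = 400_000   # US$ per year for curation and preservation
PUBS_PER_100K = (3, 4)          # NSF core grants: publications per US$100,000


def reuse_rate(articles: int, datasets: int) -> float:
    """Third-party articles per deposited data set."""
    return articles / datasets


def papers_from_grants(budget: float, pubs_per_100k: float) -> float:
    """Papers expected if the same budget funded original research."""
    return budget / 100_000 * pubs_per_100k


# Assumption: reuse scales linearly with the number of archived data sets.
rate = reuse_rate(GEO_REUSE_ARTICLES, GEO_DATASETS_2007)
archive_papers = rate * DRYAD_DATASETS_PER_YEAR

grant_low = papers_from_grants(ANNUAL_ARCHIVE_COST, PUBS_PER_100K[0])
grant_high = papers_from_grants(ANNUAL_ARCHIVE_COST, PUBS_PER_100K[1])

print(f"Reuse rate: {rate:.2f} articles per data set")
print(f"Papers supported by archiving: about {archive_papers:.0f}")
print(f"Papers from the same sum as grants: {grant_low:.0f} to {grant_high:.0f}")
```

Under these assumptions the archive figure comes out a little above 1,000 papers, against 12 to 16 from the same sum spent on original research; the letter's "about 16" corresponds to the upper end of the 3-4 range.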

Heather A. Piwowar Dryad, 
and the National Evolutionary 
Synthesis Center, Durham, North 
Carolina, USA. 
hpiwowar@nescent.org 

Todd J. Vision Dryad, and the 
University of North Carolina, 
Chapel Hill, North Carolina, USA. 
Michael C. Whitlock 

Dryad, and the University of 
British Columbia, Vancouver, 
Canada. 


Noisy oil exploration 
disrupts marine life 


Fossil-fuel operations in 
the Arctic will inevitably 
compromise habitat — regardless 
of spills (Nature 472, 163; 2011). 
The seismic airgun surveys 
used for hydrocarbon 
exploration and for monitoring 
deposit conditions can disrupt 
foraging behaviour of bowhead 
whales at long distances. They 
also seriously diminish fisheries 
catches of haddock and other 
Arctic species, and have halted 
the migration of fin whales (a 
non-Arctic species) at a range of 
more than 175 kilometres. 
Bowhead and beluga whales 
avoid oil-derrick operations. 
Many other noises associated 
with fossil-fuel exploration 
and production — such 
as construction, shipping, 
transport helicopters and 
underwater acoustic telemetry 
— have a deleterious impact 
on the marine acoustic 
environment. 
We do not yet know about 
the impacts of noise from 
thruster-stabilized exploration
platforms and from sea-floor
processing equipment such as 
wellhead chokes, separators 
and re-injectors, which operate 
out of sight and under extreme 
pressures. 

In collaboration with the 
World Wildlife Fund and the 
Natural Resources Defense 
Council, we are developing a 
peer-reviewed website that can 
be understood by a lay audience 
in order to explore some of 
these issues (see go.nature. 
com/5vuebe). 

Michael Stocker Ocean 
Conservation Research, 
Lagunitas, California, USA. 
mstocker@msa-design.com 


Address education 
inequality in India 


Narrowing the educational 
achievement gap between 
different social groups in India 
remains a major challenge, 
despite 60 years of affirmative- 
action policy (Nature 472, 
24-26; 2011). Using publicly 
available data from the country’s 
top medical school, the All 
India Institute of Medical 
Sciences (AIIMS), we found that 
performance was poor among 
students admitted under a 
government scheme for socially 
disadvantaged groups. 

All government and 
government-aided institutions in 
India allocate a fixed percentage 
of places on educational courses 
to socially and economically 
disadvantaged students. But in 
1995-2005, out of more than 
600 indigenous tribes with access 
to such positions, one small group 
from northern India accounted 
for 36% of students admitted to 
the AIIMS. 

Between 1998 and 2006, 
socially deprived students 
accepted into the AIIMS 
scored 13.6% less in the 
entrance exam than students 
from non-disadvantaged 
social classes (P< 0.001). In 
1989-98, such students also 
had double the dropout rate of 
non-disadvantaged students 
(6% versus 3%; P>0.05). In 
the ten years for which data 
are available (1995-2005),
61.4% of students admitted to
government-reserved positions
had to resit examinations in at 
least one subject, compared with 
15.2% of non-disadvantaged 
students (P< 0.001). 

To address such inequality, 
India should adopt measures 
that have proved successful in 
other countries. These include 
wider access to quality primary 
education; standardized 
assessment of students; and 
academic support for students 
who are lagging behind. More 
research to assess this inequality 
is also needed to inform 
education policy. 

Manas Kaushik, Subha Ramani 
Boston University, USA. 
mkaushik@post.harvard.edu 


Fund experiments on 
atmospheric hazards 


The radioactivity released from 
Japan's damaged Fukushima 
Daiichi nuclear power plant has 
increased the urgency to fund 
tracer experiments that will 
improve models of atmospheric 
dispersion and reinforce 
confidence in emergency 
procedures. 

The last major tracer 
experiments were conducted 
in the mid-1990s. So the 
predictive capabilities of current 
atmospheric-dispersion models 
have not been properly tested, 
hindering their evaluation and 
development. 

To generate more 
observational data, multiple-scale 
atmospheric tracer experiments 
should use non-hazardous, 
climate-neutral substances and a 
realistic release term with varying 
source strengths. Modellers 
could estimate emissions in 
real time using a limited set of 
observations without knowing 
the actual release rates, and later 
improve their models and data- 
reconstruction methods on the 
basis of the real source terms and 
measurements. 

Stefano Galmarini European 
Commission Joint Research Centre, 
Italy. Andreas Stohl Norwegian 
Institute for Air Research, Norway. 
Gerhard Wotawa Central 
Institute for Meteorology and 
Geodynamics, Austria. 
gerhard.wotawa@zamg.ac.at 


19 MAY 2011 | VOL 473 | NATURE | 285 




OBITUARY 


William Nunn Lipscomb Jr 


(1919-2011) 


Chemist who discovered a new kind of bonding. 


William Nunn Lipscomb Jr could
have made a career in music or
science. He was an accomplished
clarinetist who played Mozart with ease
and grace, and attended the University of 
Kentucky in Lexington on a music schol- 
arship. It is chemistry’s good fortune that 
he ultimately chose science. His work on 
the boron hydrides led to a major rethink 
of how atoms bind together to form stable 
molecules. 

Lipscomb, who died on 14 April, was 
born in Cleveland, Ohio, to a physician 
father and housewife mother. The family 
moved to Lexington when he was a year 
old. Both his grandfather and great-grand- 
father had been physicians, and Lipscomb 
was expected to continue the family tradi- 
tion. But after graduating with a degree in 
chemistry from the University of Kentucky 
in 1941, he entered the graduate programme 
in physics at the California Institute of Tech- 
nology in Pasadena. 

Lipscomb soon returned to chemistry and 
under the influence of his mentor, Nobel 
laureate Linus Pauling, he developed an 
intense interest in chemical bonding. After 
completing his PhD in structural chemistry 
(during which he also conducted classified 
Second World War related research that 
involved, as he recounted, “walking around 
with beakers of nitroglycerine”), he joined 
the faculty at the University of Minnesota 
in Minneapolis in 1946. In 1959, he was 
appointed professor of chemistry at Harvard 
University in Cambridge, Massachusetts. 

Lipscomb’s research from 1960 onwards 
included important structural studies of 
enzymes. But it was his investigations of the 
boron hydrides, or boranes, from the late 
1940s until the 1970s that led to his being 
the sole recipient of the 1976 Nobel Prize 
in Chemistry. 

The impact of this work is best appreci- 
ated in light of the ideas on chemical bond- 
ing that prevailed at the time. Organic 
synthesis, a highly successful branch of 
chemistry concerned with manipulating 
hydrocarbons and their derivatives, is based 
on the perception that carbon atoms bind to 
other atoms through covalent bonds con- 
sisting of a pair of electrons. It was assumed 
that boron, a neighbour of carbon in the 
periodic table, would behave similarly, but 
its hydrides posed a major problem. 

Boron’s unconventional chemistry had 
fascinated and confounded researchers for 
years. At first, following its isolation in 1808,
the element seemed unremarkable, forming
the expected trivalent compounds such as
BCl₃ and B(CH₃)₃. Its simplest hydride, it was
thought, must be BH₃. William Ramsay, who
had won a Nobel prize in 1904 for his discovery
of the noble gases, was convinced of this.
Not so the German chemist Alfred Stock,
who took up the study of boranes in 1909.
Using an ingenious vacuum glass apparatus
of his own invention, Stock prepared and isolated
a whole family of boron hydrides, none
of which was BH₃. In fact, he painstakingly
proved that the simplest borane was B₂H₆
(diborane). For many years the molecular
structures of these hydrides, which included
B₄H₁₀, B₅H₉, B₅H₁₁ and B₁₀H₁₄, among others,
remained unknown. In 1948, the basket
shape of B₁₀H₁₄, which is a solid at room temperature,
was established by the US chemist
John Kasper and his colleagues.

A general understanding of borane struc- 
tures emerged only when Lipscomb and his 
co-workers investigated the hydrides using 
both theory and X-ray crystallography. The 
challenge was formidable, as the boranes 
smaller than B₁₀H₁₄ are gases or volatile
liquids at room temperature. This meant
that the crystals had to be grown in sealed
capillaries at very cold temperatures using
liquid nitrogen (−196 °C) and maintained in
that state while X-ray diffraction data were
collected. For B₂H₆, the crystallography
needed even colder temperatures, requiring
liquid helium (−269 °C), an approach that
had never before been attempted.

Lipscomb and his colleagues discovered 
that these molecules had unprecedented 
cage-like shapes quite unlike hydrocar- 
bons, and they developed a detailed theory 
to explain why. The central idea was the 
‘two-electron, three-centre bond’, in which
three atoms are bound together by a single
pair of electrons. Although three-centre
B–H–B bonds had been postulated earlier
by others, Lipscomb’s extension of the idea
to the B–B–B bond was a major conceptual
advance. 

This intuitive leap was the key to under- 
standing the borane structures. More 
broadly, it implied that such multicentre 
bonding might allow the stable existence 
of many other types of molecular clusters. 
Indeed it does, in the form of thousands 
of known carboranes (clusters containing 
carbon and boron) and other compounds 
that incorporate most of the elements in the 
periodic table, including the ‘nonclassical 
hydrocarbons’ such as the pyramidal-shaped
C₆(CH₃)₆²⁺ ion.

Bill encouraged unconventional thinking 
even at the risk of occasional error, as he put 
it in a classic paper in 1954. Those of us who
worked with the Colonel (he was delighted 
to be named a Kentucky Colonel by the 
state’s governor in 1973) came to appreci- 
ate his philosophy of the ‘intuitive leap’ as a 
method of advancing science. 

Throughout his scientific career, music 
remained a serious avocation. He served as 
principal clarinetist with the Pasadena and 
Minneapolis Civic Orchestras, helped to 
found the New Friends of Chamber Music 
in Minneapolis and played regularly for 
years with members of the Boston Sym- 
phony Orchestra. 

A member of the Baker Street Irregulars 
(devotees of Sherlock Holmes), Bill was also 
given to quoting from Arthur Conan Doyle 
and Lewis Carroll in his papers to make a 
point. These facets of his personality, as 
well as the revolution he fomented in the 
understanding of the covalent bond, form 
his lasting legacy.
NEWS & VIEWS


Species loss revisited 


Conservationists predict massive extinctions as a result of habitat loss. Habitat loss undoubtedly does drive extinctions, but 
dealing with an unmet assumption that underlies these predictions yields much lower estimates. SEE LETTER P.368 


CARSTEN RAHBEK & ROBERT K. COLWELL 


Scientists generally agree that Earth is
facing a biodiversity crisis, losing species
100 to 1,000 times faster than the normal
background rate of extinction¹ and resulting in
the sixth period of mass extinction in Earth’s
history. On page 368 of this issue, He and Hubbell²
provide a fresh perspective on predictions
of the rate of this species loss.

Previous periods of mass extinction were 
driven by global changes in climate and in 
atmospheric chemistry, bolide impacts and 
volcanism³. This time, species extinction is
a result of interaction and competition for 
resources with another species — humans. 
We are immensely successful. Our numbers 
are many times higher than ecological theory 
would predict for a species with our life his- 
tory and body mass. We explore, populate 
and drastically alter almost all corners of the 
Earth and modify the global climate. Loss of 
habitat is predicted by various studies to cause 
the extinction of 20-50% of all species in just 
half a century’. These estimates began to sur- 
face decades ago, but sceptics have repeatedly 
demanded evidence of widespread extinction,
asking ‘Where are the bodies?’ If proof is
not forthcoming, they argue, then politicians
and decision-makers should denounce the
biodiversity crisis as a myth⁴.

He and Hubbell² question the way that extinction
rates attributed to habitat loss have most
often been estimated. Biologists have struggled 
for decades to estimate how many species are 
going extinct. Traditionally, the answer has 
relied on estimates based on an almost univer- 
sal ecological relationship — when we inventory 
the species in an area of natural habitat, the list 
grows as the area is increased. Using theoreti- 
cal or empirically derived functions to describe 
this species–area relationship (SAR), it has
been assumed that, by working backwards
along the SAR, one can estimate the number of
species that would be lost to extinction if a larger
area were reduced by habitat loss.

A classic rule of thumb says that if habitat 
area is reduced by 90% (comparable to actual 
habitat loss in many regions), roughly one-half 
of its species will be lost. He and Hubbell cite 
studies using SAR that predicted the loss of 50% 
of all species by the year 2000, predictions
that clearly have not been fulfilled. The discrepancy
is well known and has often been
explained as ‘extinction debt’, a time-lag before
populations reduced in numbers by habitat
loss actually become extinct. Individuals of
long-lived species may continue to reproduce
or simply live on without reproducing, even if
the current living space for the species cannot
sustain viable populations over time.

Figure 1 | Estimating species extinctions due to habitat loss. This hypothetical example shows the
contrast between use of the backwards species–area relationship (SAR), traditionally used to predict
extinctions, and the true endemics–area relationship (EAR) that correctly estimates extinctions with
increasing area lost. The coloured circles under the graph represent the spatial ordering of 37 individuals
(each occupying one unit of area) of 8 species along a transect through a habitat, each species indicated
by a different colour. The total area surveyed increases with each individual encountered. As the first
individual of each species is found, the SAR rises by one species, whereas the EAR is incremented only
when the last individual of a species is accounted for along the transect. The backwards SAR mirrors the
loss of species as area is reduced by moving right-to-left along the SAR. He and Hubbell² demonstrate
mathematically and with examples for trees and birds that, for realistic (aggregated) spatial patterns of
individuals and species, the backwards SAR always lies above the true EAR, thus overestimating expected
rates of extinction. Species aggregation is simulated here by placing dots of the same colour closer to one
another than expected at random.

The authors² explain why this traditional
‘backwards’ use of SAR is fundamentally 
flawed for typical spatial diversity patterns, and 
show that this approach can produce drastic 
overestimation of extinction rates. 
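
The rule of thumb quoted earlier (a 90% loss of habitat costing roughly half the species) is what one gets by running a power-law SAR backwards. The sketch below assumes the common form S = cA^z with an exponent z = 0.3; the article gives neither the functional form nor a value of z, so both are illustrative assumptions, as is the function name.

```python
def species_fraction_remaining(area_fraction_remaining: float, z: float) -> float:
    """Backwards power-law SAR.

    With S = c * A**z, the constant c cancels, so the fraction of species
    predicted to survive is (A_after / A_before) ** z.
    """
    if not 0.0 < area_fraction_remaining <= 1.0:
        raise ValueError("area_fraction_remaining must be in (0, 1]")
    return area_fraction_remaining ** z


# Assumed exponent, chosen for illustration; not taken from the article.
Z_ASSUMED = 0.3

# 90% of the habitat is lost, so 10% of the original area remains.
predicted_loss = 1.0 - species_fraction_remaining(0.10, Z_ASSUMED)
print(f"Predicted species loss: {predicted_loss:.1%}")
```

Under these assumptions the predicted loss comes out close to one-half. This only reproduces the traditional calculation; it is the estimate that He and Hubbell argue is too high for spatially aggregated species.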

The problem with the traditional approach 
is surprisingly simple. With increasing habitat
area, the SAR rises by one species unit each 
time the first individual of a species new to the 
inventory is encountered (Fig. 1). Additional 
individuals of a species already encountered 
add nothing to the species count. By contrast, 
with decreasing habitat area an extinction does 
not occur until the last individual of a species 
is encountered. The authors show that, for the 
aggregated spatial patterns characteristic of 
species in real communities, the predicted 
number of extinctions rises more gradually with 
increasing habitat loss than predicted by the 
‘backwards’ SAR (Fig. 1). The curve that correctly
describes the rate of extinction as habitat
area decreases is called the endemics–area
relationship (EAR). This was proposed more 
than a decade ago by Harte and Kinzig’ and, 
they persuasively argued’, is more appropriate 
than the SAR for estimating species extinctions, 
especially under non-random spatial distribu- 
tions’. (A species is endemic if it is found only 
within some specified area.) 
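
The bookkeeping behind the two curves can be sketched for a one-dimensional transect. In this minimal example, each list element is one individual occupying one unit of area; the SAR counts a species at its first individual and the EAR counts it only at its last. The transect and the function name are hypothetical, and the toy data only illustrate the counting rule, not He and Hubbell's general result.

```python
from typing import List, Sequence, Tuple


def sar_and_ear(transect: Sequence[str]) -> Tuple[List[int], List[int]]:
    """Cumulative SAR and EAR counts along a transect.

    sar[i]: species seen at least once in units 0..i
    ear[i]: species whose individuals all lie within units 0..i
    """
    # Position of the last individual of each species.
    last_index = {species: i for i, species in enumerate(transect)}
    seen = set()
    endemics = 0
    sar: List[int] = []
    ear: List[int] = []
    for i, species in enumerate(transect):
        seen.add(species)              # the first individual raises the SAR
        if last_index[species] == i:   # the last individual raises the EAR
            endemics += 1
        sar.append(len(seen))
        ear.append(endemics)
    return sar, ear


# Hypothetical transect of 14 individuals from 4 loosely clumped species.
transect = list("AABABBCBCCDCDD")
sar, ear = sar_and_ear(transect)
for area, (s, e) in enumerate(zip(sar, ear), start=1):
    print(f"area={area:2d}  SAR={s}  EAR={e}")
```

At every area the EAR is at most the SAR, and the two meet only when the whole transect is included, because a species cannot be endemic to an area before it has been encountered there.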

In their novel conceptualization of the 
problem, He and Hubbell² show that both the
classic SAR and the EAR can be derived from 
a sampling theory based on spatially explicit 
patterns of individuals. Applying this approach 
to empirical data for woody plants in the rain- 
forest and North American birds, which show 
typical patterns of spatial aggregation, they 
quantify the substantial discrepancy between 
backwards-SAR-based and EAR-based extinc- 
tion rate predictions (finding overestimation 
as high as 160% for the plants). Importantly, 
the authors also justify the use of a simple 
approximation for the EAR that is robust to 
variation in species’ spatial patterns and scale. 

He and Hubbell, then, strongly question 
the use of SAR to estimate extinction rates not 
only from direct habitat loss, but also from pro- 
jected species-range contractions expected to 
occur under climate change (see ref. 8 for an 
example). But they emphasize that their results 
do not in any way imply that there is not an 
ongoing mass extinction of species, nor that 
extinction debt is not a genuine biological phe- 
nomenon. Even with a better way to estimate 
rates of future species extinctions, there is still a 
need to obtain the data required to use the EAR 
to make more rigorous estimates. There is also 
the daunting problem of rigorously inferring 
extinction — showing that the last individual 
of a species has indeed died. 

We invest heavily in infrastructure to store 
and make accessible the data we have, but by 
and large we have all but halted investment 
in discovering and describing the diversity 
of species with which we share the Earth. At 
best we have described only about 10% of all 
living multicellular species. If we ‘fog’ a tropi- 
cal tree, literally hundreds of insect species 
unknown to science fall to the ground. Every 
year, many new species of even the best-known 
groups, the mammals and birds, are described. 
For only a fraction of the known species do 
we have even a rough idea of their entire 
geographical distributions. 

Most of Earth's biodiversity occurs in tropi- 
cal regions where species occur at low density 
and tend to have tiny geographical ranges. The 
first individual of such a species encountered 
in a brief inventory is not far from the last to 
go when extinction threatens, compared with 
populous, widespread species at higher lati- 
tudes. Thus, when modifying tropical habitat 
through forestry, mining or agriculture, we 
rarely have an idea which species inhabit the 
environment we are about to affect, nor the 
exact consequences of our action. The ‘body 
bags’ are rarely counted.


Carsten Rahbek is at the Center for 
Macroecology, Climate and Evolution, 
Department of Biology, University of 
Copenhagen, DK-2100 Copenhagen, 
Denmark. Robert K. Colwell is in the 
Department of Ecology and Evolutionary 
Biology, University of Connecticut, Storrs, 
Connecticut 06269-3043, USA. 

e-mails: crahbek@bio.ku.dk; 
robert.colwell@uconn.edu 




1. Pimm, S. L., Russell, G. J., Gittleman, J. L. & Brooks, T. M. Science 269, 347–350 (1995).
2. He, F. & Hubbell, S. P. Nature 473, 368–371 (2011).
3. Barnosky, A. D. et al. Nature 471, 51–57 (2011).
4. Lomborg, B. The Skeptical Environmentalist: Measuring the Real State of the World (Cambridge Univ. Press, 2001).
5. Harte, J. & Kinzig, A. P. Oikos 80, 417–427 (1997).
6. Kinzig, A. P. & Harte, J. Ecology 81, 3305–3311 (2000).
7. Green, J. L. & Ostling, A. Ecology 84, 3090–3097 (2003).
8. Thomas, C. D. et al. Nature 427, 145–148 (2004).


Bound and unbound 
planets abound 


Two teams searching for extrasolar planets have jointly discovered a new 
population of objects: ten Jupiter-mass planets far from their host stars, or 
perhaps even floating freely through the Milky Way. SEE LETTER P.349 


JOACHIM WAMBSGANSS 


Two decades ago, we had no idea whether
planets orbiting stars other than the Sun
existed at all. Today, more than 500 exoplanets
have been discovered, and the field of
exoplanet research has advanced to become 
one of the most captivating branches of astron- 
omy. Observational techniques now aim to 
address questions such as what the atmosphere 
and weather are like on some of these planets, 
and to determine their global statistical prop- 
erties. On page 349 of this issue, the MOA and 
OGLE research teams’ provide an exciting 
result for exoplanetary science: the discovery 
of a population of planets that have roughly 
the mass of Jupiter and separations from their 
putative host stars of at least ten times Earth's 
distance to the Sun. 

The teams’ finding’ is based on gravita- 
tional microlensing, an established technique 
for detecting exoplanets that is well placed for 
statistical studies of exoplanets. There are two 
particularly exciting aspects to the discovery 
of this new exoplanetary population. The first 
is the authors’ conclusion that, on average, 
there is more than one Jupiter-mass planet per 
Milky Way star. The second is the evidence that 
these planetary-mass objects could be at great 
distances from their host stars. Some of them 
could even be floating freely through the Milky 
Way — that is, they might not be gravitationally 
bound to any star at all. 

Gravitational microlensing is one of a suite 
of planet-search techniques. The methods are 
truly complementary to one another, each 
probing different planetary properties and 
having its own particular strengths”. But most 
of them detect and explore nearby exoplanets. 
By contrast, microlensing probes more distant 
planets, using the host star–planet system as a
magnifying glass. When a foreground star (the 
lens) passes in front of a distant, background 
star, the latter is magnified and displays a 
characteristic ‘light curve’. The two observa- 
bles that characterize such a microlensing event 
are the height of the light curve's magnification 
peak and the duration of the magnification, 
which depends, among other parameters, on 
the mass of the lens: the lower the mass, the 
shorter the duration. Originally proposed as 
a way of searching for dark matter, it soon 
became clear that microlensing could also 
be used to detect planetary systems’: a planet 
orbiting the foreground star would produce a 
secondary peak in the light curve (Fig. 1). 
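
The dependence of event duration on lens mass can be made concrete. For fixed distances and relative velocity, the standard point-lens result is that the duration scales with the square root of the lens mass. The sketch below applies that scaling to a reference event; the reference values (a 0.3-solar-mass lens producing a 20-day event) and the function name are assumptions chosen for illustration, not figures from the article.

```python
import math

# Jupiter's mass expressed in solar masses (approximate).
JUPITER_MASS_SOLAR = 9.55e-4


def scaled_event_duration(
    lens_mass_solar: float,
    ref_duration_days: float = 20.0,  # assumed duration of the reference event
    ref_mass_solar: float = 0.3,      # assumed lens mass of the reference event
) -> float:
    """Scale a microlensing duration with lens mass, using duration ∝ sqrt(mass).

    Distances and relative velocity are held equal to those of the reference event.
    """
    if lens_mass_solar <= 0 or ref_mass_solar <= 0:
        raise ValueError("masses must be positive")
    return ref_duration_days * math.sqrt(lens_mass_solar / ref_mass_solar)


jupiter_days = scaled_event_duration(JUPITER_MASS_SOLAR)
print(f"Jupiter-mass lens: about {jupiter_days:.1f} days")
```

With these assumed reference values, a Jupiter-mass lens gives an event lasting roughly a day, which is why hourly monitoring is needed to catch the events of less than two days described later in the article.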

Microlensing offers two advantages over 
other methods: it has the potential to yield the 
most representative statistical sample of Milky 
Way planets and it is, in principle, sensitive 
enough to detect Earth-mass objects”® with 
current technology. However, the downside is 
that microlensing events are rare: fewer than 
one in a million stars in the central part of the 
Milky Way are microlensed at any given time 
by a foreground lensing star. And even if every 
such lensing star had a Jupiter-mass planet at a 
few times the Earth-Sun distance, only about 
1% of these planets would be detected, owing 
to the exact geometric alignment required 
between the background star, the planetary 
system and an observer on Earth. So discover- 
ing such microlensing events is akin to finding 
a needle in a haystack. 

Figure 1 | Planet microlensing. a, When a foreground star (red) passes in
front of a distant, background star (yellow), it bends the background star’s
light and causes it to brighten and fade with a characteristic ‘light curve’. b, For
a foreground system composed of a star and an orbiting planet (brown) that
are close to each other, the brightening and fading can be accompanied by a
sharp secondary peak due to the planet. c, If the host star and planet are far

To tackle these statistical challenges, a
handful of independent research teams have
developed advanced techniques to monitor
the brightness of about 100 million Milky
Way stars every few days. These techniques
have allowed the teams to routinely find about
1,000 (stellar) microlensing events per year. So
far, however, only about a dozen exoplanets 
have been detected by microlensing. Never- 
theless, impressive results have been derived 
on the abundance of planets in the Milky Way: 
planetary systems similar to our own are 
expected around one sixth of all stars’, and 
cold Neptune-mass planets are common’. 

In a specially designed study, the MOA 
(Microlensing Observations in Astrophysics) 
team’ monitored 50 million Milky Way stars for 
about two years, each at least once per hour. In 
this way, they were able to detect microlensing 
events of very short duration. In a careful anal- 
ysis of the data — which excludes all known 
sources of contamination — the team has 
now discovered’ 474 individual microlensing 
events, ten of which lasted for less than two 
days. The researchers then added independent 
data obtained by the OGLE (Optical Gravita- 
tional Lensing Experiment) team”, to substan- 
tiate their original conclusions that there are 
many more short-duration microlensing events 
than expected from the known population 
of stars and brown dwarfs in the Milky Way. 
The authors’ interpret this over-abundance of 
short events as being produced by a thus-far 
unknown population of Jupiter-mass objects. 

Because the observed light curves for the ten 
very short-duration microlensing events do 
not show any signature of a possible host star, 
the authors’ conclude that these Jupiter-mass 
objects must be located at distances from their 
host stars of at least ten times the Earth-Sun 
distance. When comparing their derived abun- 
dance of Jupiter-mass objects with upper limits 
on abundances of wide-separation exoplanets 
from direct detections, they’ argue that it is 
very likely that most of their newly discovered 
planetary-mass objects are unbound. These
conclusions prompt at least two questions. 

To be or not to be called a planet — that is 
the first (linguistic) question. After the first 
discovery, about a decade ago, of isolated low- 
mass objects in young star-forming regions”, 
a heated discussion ignited over what to call 
these entities. Among the contending denominations
were ‘free-floating planets’, ‘isolated
planetary-mass objects’, ‘objects formerly
called planets’ and ‘rogue planets’. One of the
contentious issues is whether the mass and the 
dynamic state of the objects concerned alone 
should determine their class name, or whether 
their formation history should also be consid- 
ered. The International Astronomical Union 
(IAU) succeeded, in 2006, in re-defining what 
a planet is. But it postponed the definition 
of an exoplanet. In light of the discovery of a 
probable new class of objects’, it may now be 
worthwhile to reconsider these definitions’””’. 

To be or not to be a bound planet — that 
is the second (astronomical) question. If 
these objects do turn out to be unbound, we 
want to understand how they reached this 
state. The MOA and OGLE teams provide¹
plausible arguments, but various hypotheses 
for the formation and dynamic state of the 
objects seem possible, and certainly deserve 
further investigation. Ultimately, the ques- 
tion of whether these objects are bound to 
stars or freely floating through the Milky 
Way will be answered through astronomical 
observations. In the former case, the relative 
motion between the background star and the 
foreground star—planet system will occasion- 
ally be oriented such that the background 
star will be magnified a second time by the 
focusing effect of the planet’s host star'*. This
second (broader) peak may well happen a few
years after (or before) the planetary blip in
the light curve. Another signature of a bound
planet, known as astrometric microlensing,
is a minute change in the position of the
background star during the magnification”’.

apart, most observed light curves will display either the broad peak associated
with the star or the sharp peak associated with the planet; very rarely will the
alignment between the background star, the foreground planetary system
and the observer on Earth be such that both peaks are observed (the two
observations can be years apart). d, For an isolated planet without a host star,
the observed light curve will always display a single, short-duration peak.

The implications of this discovery’ are 
profound. We have a first glimpse of a new 
population of planetary-mass objects in our 
Galaxy. Now we need to explore their proper- 
ties, distribution, dynamic states and history. 
A continuation of high-cadence ground- 
based microlensing observations will surely 
shed some further light on these objects. But 
dedicated observations by satellite telescopes 
with large viewing angles will be pivotal for a 
full understanding of this population. Well- 
developed concepts for such projects'®“* on 
both sides of the Atlantic guarantee a head 
start. Exploring unbound (former) satellites of 
stars with bound (future) satellite telescopes of 
planet Earth will open up a new chapter in the 
history of the Milky Way.
A positive side of disaster


In October 1998, a hurricane visited death and destruction on Honduras, with 
flooding and mudslides. A case history of a rural community documents how 
recovery from that event produced socio-economic improvement. 


ARUN AGRAWAL 


It may seem heartless, or at least inappropriate,
to talk of natural disasters as windows of
opportunity. Because they are major shocks
to socio-economic systems, disasters such as
Cyclone Nargis (Myanmar, 2008) or Hurricane
Katrina (United States, 2005) are capable
of wiping out entire settlements, destroying the
lives and livelihoods of thousands, if not hundreds
of thousands, and wreaking material
destruction on a massive scale’. The poor suffer
most’. But writing in Proceedings of the National
Academy of Sciences, McSweeney and Coomes*
suggest that even such disastrous events may
sometimes yield positive economic outcomes
for the rural poor.

Four years after the devastation caused by
Hurricane Mitch in Honduras in 1998 (Fig. 1),
the authors found that the indigenous Tawahka
community of Krausirpi, in the northeast of
the country, was better off than before the
disaster. Furthermore, income and assets in the
community were more equitably distributed. 
Households had on average three times more 
land; poorer groups and women had gained 
more land; agricultural production had been 
re-established; sources of income were more 
diverse; a new land-tenure system was in 
place; and the community was probably more 
resilient to future climate shocks. 

What explains this relatively positive pic- 
ture? Two processes helped: education among 
those who had been land-poor, and new norms 
of land tenure rooted in the diffuse character 
of decision-making among the Tawahka. 
Younger families with higher education found 
it easier to access new employment and wage 
opportunities offered by non-governmental 
organizations and state agencies; these same 
families tended to be smaller, and had previ- 
ously found it difficult to clear land. Krausirpi 
is a land-surplus community; after Mitch, 
local residents quietly laid claim to new areas
of arable land through individual negotiations;
no explicit, centralized decision was made
to adopt a different system of land allocation.

Figure 2 | Disasters, interventions and outcomes.
The appropriate application and withdrawal of
interventions can alter the social trajectory of a
hard-hit community.

McSweeney and Coomes, then, conclude 
that Hurricane Mitch effectively reset the 
social, economic and institutional machinery 
in Krausirpi. The hurricane forced all house- 
holds to look for new land and, in the process, 
to accept a new mechanism through which 
to allocate land. It encouraged local residents 
to look for new sources of income and access 
to assets. Finally, it permitted households to 
take advantage of existing human capital in 
novel ways because of wage opportunities 
that became available with the presence of 
development organizations. 

What does this example have to say to those 
interested in post-disaster reconstruction — 
to those many decision-makers and aid work- 
ers who want not only to provide immediate 
succour to victims of disasters but also to do 
so in a way that makes recovery sustainable 
and equitable*? The first point to highlight 
is that disasters are a natural ‘reset button’:
what happens in their wake is shaped by his- 
torical forces, to be sure, but they also enable 
greater leveraging power to new resources, 
fresh endeavours and innovative institutions, 
because older structures and processes lose at 
least part of their historical force. 

A second point is that it is possible to 
improve both incomes and equity in the wake 
of disasters. But doing so requires a focus not 
only on productive opportunities, but also 
on strategies that favour the poor and the 
less powerful; a focus on the creation of new 
income streams and assets, but through access 
strategies that are equitably distributed; and a 
focus on processes of institutional change, but 
those in which less advantaged households and 
groups have a voice. In each case, part of the 
attention is on enhancing incomes, but also, 
and importantly, on achieving an equitable 
outcome. Figure 2 summarizes these points. 

Finally, many analysts of disaster relief and 
reconstruction have observed the tension
between the imperative to provide immedi- 
ate relief and efforts to launch self-organizing 
processes of change that sustain recovery 
and equity’. Immediate disaster relief often 
requires quick decisions, has only limited 
opportunities for participatory interventions, 
and is typically externally driven. Long-term 
positive changes, on the other hand, require 
local buy-in and activation of a community's 
capacities both through strategic interventions 
and through disengagement at the appropriate 
time. Supporting internal institutional change 
that is structured by the interests of weaker
and poorer groups is a crucial precursor to 
disengagement. Knowing when to restrict or 
cease to provide external material support, and 
instead facilitate a transition to educational 
and institutional support mechanisms that 
recognize local capacities and provide oppor- 
tunities for the less powerful, is necessary for 
disaster relief to be effective in the long run.
A deep foundry


Melting and solidification of iron alloys in Earth’s core may explain structural 
complexity in the solid inner core, and alter the way we think about the 
dynamics of the deep interior. SEE LETTER P.361 


BRUCE BUFFETT 


Textbooks depict Earth as having an
onion-layered structure with a solid
steel ball at the centre. The central
body, known as the inner core, is thought to 
have formed by gradual cooling and solidifica- 
tion of the surrounding liquid outer core’. On 
page 361 of this issue, Gubbins and colleagues’ 
turn convention on its head by arguing that 
a large fraction of the inner core’s surface is 
melting. Our understanding of both 
the structure and the dynamics of the 
core may change as a consequence. 

The authors’ conclusion is based 
on a numerical model’ that simulates 
convection and magnetic-field genera- 
tion in the liquid core. Cooling of the 
liquid core drives convection, but it is 
the more massive and sluggish mantle 
surrounding it that regulates the rate 
of cooling. Spatial variations in heat 
flux at the top of the core exert a strong 
influence on the pattern of fluid flow’. 
In the authors’ simulations, cold fluid 
is focused into narrow plumes, which 
descend to the inner-core boundary 
and promote localized solidification. 
Elsewhere, a broad return flow is asso- 
ciated with warm fluid that persistently 
exceeds the melting temperature at the 
inner-core boundary. 

Temperatures in the fluid’s interior 
can exceed the boundary temperature 
because the core is mainly cooled from 
above rather than heated from below. 
Fluid parcels become warmer relative 
to a decreasing background tempera- 
ture if the parcels are not cooled at the 
average rate. Cooling produces net 
growth of the inner core, and is so intense that
solidification is required below narrow regions 
of cold fluid in order to offset large areas of 
melting in warmer regions. 

Fractionation of impurities in the liquid on 
solidification is expected to enrich the solid 
in iron’; therefore, melting should produce 
a dense liquid that pools on top of the inner 
core. Gubbins et al.” argue that such a melt 
layer offers a simple explanation for unusual 
values recorded for seismic velocities near the
surface of the inner core’®. In addition, heterogeneity
in grain size and composition at that
surface may eventually be buried by inner-core
growth, possibly explaining the complex
structure detected inside the inner core’. All
in all, melting of the inner core provides a tidy
explanation for several observations, although
a few details remain to be explored.

Figure 1 | Melting at the inner-core boundary. Gubbins et al.
suggest’ that warming in the liquid outer core produces widespread
melting at the boundary with the solid inner core. In this phase
diagram, the initial position of the inner-core boundary is defined by
the intersection of the core temperature (geotherm; dotted blue line)
and the melting temperature (liquidus; dotted red line). An increase
in the geotherm (solid blue line) promotes melting of the iron-rich
solid, diluting the concentration of impurities in the liquid. That
raises the liquidus temperature (solid red line) until it intersects the
warmer geotherm, re-establishing thermodynamic equilibrium. The
dense melt resides within the porous solid near the top of the inner
core, as shown in the inset.

The authors pay careful attention to several 
problems that arise in applying their model 
to Earth. Two additional points are worth 
mentioning. The first is the perennial con- 
cern about the validity of numerical models, 
given that the physical parameters are very far 
from realistic values. More specifically, could 
small-scale turbulence, largely absent from the 
current models, disperse cold plumes long 
before they reach the inner-core boundary? As 
the numerical models improve we can expect 
to gain better insight into their reliability. 

A second question involves the role of 
composition in the melting temperature. 
Impurities in the liquid core depress 
the liquidus temperature relative 
to that of pure iron by 600 kelvin or 
more. An iron-rich melt is expected
to solidify before a liquid with the bulk 
composition of the outer core. So what 
happens after the inner core melts? 

We expect the inner-core bound- 
ary to represent the top of a mushy 
region where solid and liquid coexist.
A temperature increase at the bound- 
ary initially promotes melting. 
However, a small amount of melt 
enriches the surrounding liquid in 
iron, which elevates the local liquidus 
temperature and brings the interface 
back into equilibrium (Fig. 1). Given 
the magnitude of the melting-point 
depression, a small amount of melt 
should be sufficient to compensate 
for thermal fluctuations in the liquid 
outer core. Would small variations 
in melt volume be detectable in seis- 
mic observations? This question will 
require a better understanding of the 
phase diagram. 
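
The equilibrium argument above can be made concrete with a rough, linearized sketch. It assumes (this is an illustration, not part of the model of Gubbins and colleagues) that the liquidus temperature falls linearly with the impurity concentration c of the liquid, with a constant slope m, and that the bulk outer-core concentration c_0 accounts for the quoted depression of about 600 kelvin. The 1-kelvin fluctuation is likewise a hypothetical value chosen only for scale.

```latex
% Assumed linear liquidus: T_Fe is the melting temperature of pure iron,
% c the impurity concentration in the liquid, m > 0 an assumed constant slope.
T_L(c) = T_{\mathrm{Fe}} - m\,c , \qquad m\,c_0 \approx 600~\mathrm{K}

% A warm anomaly \delta T at the boundary is balanced when melting of the
% iron-rich solid dilutes the adjacent liquid by \delta c < 0:
T_L(c_0 + \delta c) = T_L(c_0) + \delta T
\;\Longrightarrow\;
\frac{\delta c}{c_0} = -\frac{\delta T}{m\,c_0} \approx -\frac{\delta T}{600~\mathrm{K}}

% Illustrative scale: \delta T = 1~\mathrm{K} gives
% |\delta c| / c_0 \approx 1/600 \approx 0.17\%
```

Under these assumptions, a fluctuation of about a kelvin is offset by a change in liquid composition of a fraction of a per cent, which is why only a small melt volume would be needed.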

The work of Gubbins and colleagues
opens a door onto new
enquiries. Melting and solidification
of the inner core allow greater interaction
with the surrounding liquid core, and raise
the possibility that surprising phenomena are
yet to be discovered. Recent speculations
about a steady translational motion of the
inner core demonstrate that strange things are
possible. The final chapter of this story is yet
to be written.

Tet proteins in the limelight


Tet proteins mediate the hydroxymethylation of DNA. New work reveals their 
function in gene regulation and the extent of their activity throughout the genome 
of embryonic stem cells. SEE ARTICLE P.343 & LETTERS P.389, P.394 & P.398


NATHALIE VERON & 
ANTOINE H. F. M. PETERS 


During mammalian development, the
one-cell zygote gives rise to a multitude
of cell types. This remarkable
process is controlled by protein machines 
that interpret the genetic code and regulate 
the expression of genes, in part by chemically 
modifying chromatin (DNA-protein com- 
plexes). One such modification is the addition 
of a methyl group at the 5-position of the cytosine
base in DNA (5mC), an alteration that
serves a crucial role in the epigenetic (or cell- 
to-cell) inheritance of gene expression during 
development. However, proteins of the Tet 
enzyme family can modify this DNA mark fur- 
ther by hydroxylating the methyl group to form 
5-hydroxymethylcytosine (5hmC). Five
papers, including four in this issue, report
on the extent of 5hmC modification across 
the genome of mouse embryonic stem cells 
and on the role of Tet proteins in regulating 
gene expression. 

The 5mC modification is required for 
genome stability and thus the embryo’s via- 
bility. It is also needed for the repression of 
genes and repetitive genomic sequences; for 
X-chromosome inactivation; and for genomic 
imprinting (in which, for some genes, either 
the maternal or the paternal copy is expressed). 
Classical studies revealed that 5mC is erased 
in primordial germ cells and during early 
embryo development, and that this process 
occurs independently of DNA replication as
the cells divide. What’s more, genes containing
5mC can become active in differentiated
cells, supporting the notion of active demethylation.
This idea, along with researchers’
ability to epigenetically reprogram cells (either 
by the technique of somatic-cell nuclear trans- 
fer or by induced pluripotency experiments), 
inspired the search for factors that mediate 
5mC demethylation, although initially there 
was limited success.

Because proteins of the Tet family (Tet1-3) 
can convert 5mC to 5hmC, they have been 
considered promising candidates for medi- 
ating DNA demethylation. But this novel 
enzymatic means of demethylation leads to 
obvious questions. Where in the genome do 
Tet proteins bind? How do they affect the sta- 
bility/turnover of 5mC? Does this influence 
gene expression during the cell cycle and in 
development? Do Tet proteins alleviate gene 
silencing by converting 5mC to 5hmC, or do 
they protect against aberrant de novo methyla- 
tion, thereby preventing silencing? And finally, 
how is 5hmC processed further? The latest 
studies shed light on these issues.

Williams et al. (page 343), Wu et al. and Ficz
et al. (page 398) localized 5hmC in the
genome of mouse embryonic stem (ES) cells
using methods that predominantly recognize
DNA sequences bearing multiple 5hmC
marks. Pastor and colleagues (page 394)
developed two alternative methods that pos- 
sess increased sensitivity for single 5hmCs. 
The general finding is that 5hmC levels across 
the genome are low. Nonetheless, the mark is
significantly enriched at CpG dinucleotides
within genes, particularly at exons (protein-coding
regions), and this is correlated positively
with gene-expression levels, as reported
before.

The authors further investigated 5hmC
localization in ES cells deficient in the three 
enzymes that are involved in de novo DNA 
methylation and in the maintenance of this 
modification (TKO cells). Like 5mC, 5hmC is 
absent in these cells. Hence, 5hmC probably 
arises from the processing of pre-existing 5mC 
along the gene body during transcription. 

In mammalian genomes, CpG dinucleotides 
occur at roughly 60% of promoter sequences, 
forming ‘CpG islands’ (CGIs) that are largely 
devoid of 5mC. Remarkably, Williams et al.
and Wu et al. report that 5hmC is significantly
enriched only at a fraction of such CGIs, pre- 
dominantly at inactive promoters, many of 
which are marked with two modifications to 
chromatin: histone 3 lysine 4 trimethylation 
(H3K4me3), which activates genes, and his- 
tone 3 lysine 27 trimethylation (H3K27me3), 
which is repressive. By contrast, Williams and
colleagues and Wu et al. (page 389) show that
Tet1 occupies not just double-marked (bivalent)
repressed promoters, but also active promoters
marked with H3K4me3 alone. There
seems to be a strong correlation between Tet1
occupancy and CpG density at CGIs and at
other genomic regions.

Most 5hmC-positive promoters display
low levels of 5mC. Does this mean that Tet1 is
involved in the turnover of pre-existing 5mC,
for instance at bivalent promoters, to protect
against inappropriate de novo DNA methylation?
If so, 5hmC might be a transient intermediate
of active demethylation, potentially
triggering downstream processing pathways
such as the base-excision repair mechanism.
The varying 5hmC levels among
CGIs (such as at H3K4me3/Tet1 promoters
compared with H3K4me3/H3K27me3 promoters)
raise questions about the occurrence
and kinetics of 5mC and 5hmC turnover as
a function of transcriptional activity and/or
local chromatin configuration.

Curiously, reducing Tet1 expression in normal
ES cells resulted in only a minor increase
in 5mC levels at CGIs and along the gene
body. This may be because a substantial
fraction of 5hmC sites is devoid of Tet1. It
might further reflect technical limitations in
reducing the active Tet1 enzyme and/or partial
functional redundancy between Tet1 and
Tet2, which is also expressed in ES cells. The
true relevance of Tet enzymes to 5mC turnover
therefore remains to be determined.

Even more surprisingly, the experiments
reveal that Tet1 predominantly has repressive, 
rather than activating, functions on its direct 
target genes. Many of the genomic regions that 
are inactivated by Tet1 are occupied by pro- 
teins of the Polycomb repressive complex 2 
(PRC2), which catalyses the formation of
H3K27me3 and represses the transcription of 
genes involved in specifying cellular identity 
during development. Although the authors
did not detect any direct biochemical inter- 
action between Tet1 and PRC2 proteins, Tet1 
depletion directly or indirectly alleviated tran- 
scriptional repression and PRC2 recruitment 
at genes to which both proteins were bound.
In addition, Tet1 binds directly to Sin3A, a 
co-repressor protein essential for inhibiting the 
transcription of a subset of genes that are also
repressed by Tet1. By analysing the expression
of selected Tet1 target genes, Williams
et al. further show in TKO cells (which lack the 
Tet1 substrate 5mC) that, on Tet1 depletion,
these genes are misregulated in a similar way to
control cells. So, at least in ES cells, Tet1 seems
to be required for transcriptional gene regula- 
tion independently of its enzymatic activity, 
and possibly for regulating the recruitment of 
proteins that define chromatin states at CGIs.

Several strands of evidence point to a functional
role for 5mC/5hmC turnover in differentiation.
During ES-cell differentiation, Tet1
and Tet2 levels decrease, as do 5hmC levels, 
whereas 5mC levels increase, concomitantly 
with changes in gene expression. Moreover,
around one-third of genomic regions containing
5hmC in ES cells acquire 5mC during
development. And 5hmC levels are almost
ten times higher in cells of the brain’s cerebellum
region that have stopped dividing than in
proliferating ES cells. Finally, Tet1 regulates
gene expression induced by neuronal activity
and 5mC turnover in the dentate gyrus of
the adult mouse brain. To clarify the role of
Tet proteins in active demethylation, genetic 
approaches aiming at delineating their cata- 
lytic and non-catalytic functions are required, 
particularly in non-dividing cells and during 
development. And so the search for factors 
indispensable to DNA demethylation continues.
Stay tuned.

Unveiling the Casparian strip


The Casparian strip in plant roots is a diffusion barrier that directs water and 
solutes from the soil to the water-conducting tissues. Proteins involved in 
making the strip have at long last been identified. SEE LETTER P.380 


MARKUS GREBE 


The roots of land plants are usually in
direct contact with the soil, which provides
water, nutrients and other solutes
but may also contain toxic substances. To
control the uptake of molecules from the soil,
vascular plants are thought to have evolved
the protective cell layer that surrounds their
water-conducting system. This layer, called the
endodermis, acts as an inner skin¹. A distinguishing
mark of endodermal cells is the
local thickening of their transverse cell walls,
which was first described in 1865 by Robert
Caspary² and later named the Casparian strip.
On page 380 of this issue, Roppolo and colleagues³
now describe a family of proteins that
are precisely located in the plasma membrane
adjacent to the Casparian strip and are needed
for its formation.

The Casparian strip contains the polymeric 
molecules lignin and suberin, which locally 
impregnate the transverse cell walls of endodermal
cells (Fig. 1a)⁴. This impregnation
hinders diffusion of water and solutes through
the cell-wall space⁵, thus forcing them to pass
through the outer membrane (plasma mem- 
brane) and the interior (cytoplasm) of the 
endodermal cells that act as a selective barrier 
(Fig. 1b). Although regulators of asymmetric 
cell division and cell-fate determination during 
endodermis formation have been identified⁶,⁷,
proteins specifically needed to make the
Casparian strip have remained unknown. 

Roppolo et al.³ reasoned that messenger
RNAs encoding proteins specifically
involved in Casparian strip formation should
be expressed in the endodermis and should 
include transcripts for membrane proteins. 
They therefore analysed mRNA transcripts 
strongly expressed in the root endodermis 
of the model plant Arabidopsis thaliana and 
identified five mRNAs encoding similar 
proteins that are predicted to contain four 
membrane-spanning regions. These pro- 
teins belonged to a large, uncharacterized 
plant-specific protein family. 

When the five proteins were fused to green 
fluorescent protein (GFP) and expressed 
in Arabidopsis plants, they were all found in 
the endodermis. Intriguingly, the proteins 
precisely localized to the plasma-membrane 
domain at the site of Casparian strip forma- 
tion, the Casparian strip domain (CSD), and 
were therefore named Casparian strip pro- 
teins 1 to 5 (CASP1-5). Roppolo et al. found 
that CASP1 and CASP3 act together during 
Casparian strip formation, because the strip 
remained patchy and not smoothly assem- 
bled in plants in which the casp1 and casp3 
genes had both been disrupted. CASP1 and 
CASP3 may even interact in a protein com- 
plex, because the proteins immunoprecipitated 
together and in complexes with other CASPs. 
Whether this involves direct interaction in 
CASP heteromers, and what role the other 
CASPs play, remain topics for future studies. 

Clearly, CASP1 and CASP3 are needed for 
the correct formation of the Casparian strip. 
But at what stage do they act? And are they 
structural components, or do they instruct 
strip assembly? Clues to the answers came from 
images of the distribution of the CASP1-GFP 
fusion protein in living roots. Roppolo et al.³
observed that localization of CASP1 at the 
CSD preceded establishment of the diffusion 
barrier between the outer and inner cell walls of 
endodermal cells, thus suggesting an early role 
for CASPs during Casparian strip formation. 
Strikingly, CASP1 was uniformly distributed 
within the plasma membrane during early 
cell differentiation, but it then relocated and 
became concentrated at the future CSD. Here, 
CASP1 first occurred at the plasma membrane 
in small patches that then progressively fused. 

Up to this stage, the distribution of CASP1 
within the cell was sensitive to the inhibition 
of a pathway — the endocytic trafficking 
pathway — that is needed for the uptake of 
proteins from the plasma membrane and 
for their redistribution within the cell. This 
implied that CASP1 is, at least in part, targeted 
from an overall distribution to its specific 
localization at the CSD by endocytic recycling. 
Intriguingly, once this specific localization was 
established, CASP1 became immobile, as if 
strongly attached to the cell wall. 

To see whether CASPs could instruct the
formation of a specific plasma-membrane
domain, Roppolo et al.³ expressed CASPs in
other tissues. This resulted in an even accumulation
of CASPs at the plasma membrane, or,
in the case of CASP5, at internal cellular membranes.
These findings indicate that additional
endodermis-intrinsic factors are needed to
direct CASPs to a specific plasma-membrane
domain and that CASP5 differs from other
family members.

Figure 1 | The Casparian strip. a, Schematic cross-section through an Arabidopsis root. The outermost
cell layer (epidermis) surrounds the cortical cell layer, which in turn surrounds the endodermis.
The transverse cell walls of endodermal cells that connect cell walls of cortical and vascular cells are
impregnated by a cell-wall thickening, the Casparian strip (red). The plasma-membrane domain of
differentiated endodermal cells in direct contact with the strip is called the Casparian strip domain (CSD)
(green). It is the localization site of the previously unknown Casparian strip proteins (CASPs) identified
by Roppolo and colleagues³. b, Close-up of the upper-left quarter of the root cross-section, showing the
flow of water and solutes (blue) through the cell-wall space. Because of impregnation by the Casparian
strip, water and solutes cannot cross transverse cell walls and are redirected to pass through the plasma
membrane and the interior of endodermal cells. These act as selective diffusion barriers, before releasing
water and selected solutes into the vascular system.

It remains to be seen whether CASPs them- 
selves instruct Casparian strip formation, but 
they are certainly the first proteins known to 
specifically participate in this process. More- 
over, their early redistribution to the CSD by 
endocytic recycling provides a mechanistic 
insight into the formation of this domain. 
These observations open the door for studies 
on how specific CASP targeting to the CSD 
is achieved, and what mechanisms underlie 
subsequent CASP immobilization. 

The availability of casp mutants should 
now allow the question of what the Casparian 
strip actually does to be addressed. Is it really 
important for the plant’s protection against 
abiotic and biotic stress factors¹? How do casp
mutants cope with boron toxicity, for example? 
Boron is an essential nutrient, but it is toxic for 
plants at high concentrations. Proteins regulat- 
ing boron influx and efflux across the plasma 
membrane can be found in the outer and inner 
endodermal plasma membrane, respectively⁸,⁹.
These transporters are excluded from the
CSD⁸,⁹, where the CASPs are located. Thus, it
will be interesting to see whether CASPs help 
to create a barrier that regulates boron flux. 

Roppolo et al.³ propose that the endodermis
is functionally analogous to an animal
epithelium, and point out that, with CASPs,
plants have evidently invented their own way
of making an equivalent to a diffusion barrier
between animal epithelial cells, the tight
junction. Indeed, tight junctions contain
proteins, the claudins¹⁰, that do not obviously
resemble CASPs at the sequence level but that
are similar in their size and overall organization
into four transmembrane domains.
Although claudins are essential for the formation
of diffusion barriers in animals¹⁰, Roppolo
et al.³ did not observe defective diffusion of a
tracer through the malformed Casparian strip
of plants lacking CASP1 and CASP3, probably
because other CASPs could compensate for
potential defects.

Roppolo et al. have started to reveal how 
plants build the Casparian strip, and it will be 
exciting to learn how it acquires its properties 
as a diffusion barrier. We can look forward to 
further unveiling of the mechanisms of CASP 
action and interaction in the near future.


Markus Grebe is in the Umeå Plant Science
Centre, Department of Plant Physiology,
Umeå University, SE-90187 Umeå, Sweden.
e-mail: markus.grebe@plantphys.umu.se


1. van Fleet, D. S. Bot. Rev. 27, 165-220 (1961).
2. Caspary, R. Jb. Wissensch. Bot. 4, 101-124 (1865/66).
3. Roppolo, D. et al. Nature 473, 380-383 (2011).
4. Zeier, J., Ruel, K., Ryser, U. & Schreiber, L. Planta 209, 1-12 (1999).
5. Robards, A. W. & Robb, M. E. Science 178, 980-982 (1972).
6. Di Laurenzio, L. et al. Cell 86, 423-433 (1996).
7. Helariutta, Y. et al. Cell 101, 555-567 (2000).
8. Alassimone, J., Naseer, S. & Geldner, N. Proc. Natl Acad. Sci. USA 107, 5214-5219 (2010).
9. Takano, J. et al. Proc. Natl Acad. Sci. USA 107, 5220-5225 (2010).
10. Furuse, M. & Tsukita, S. Trends Cell Biol. 16, 181-188 (2006).




19 MAY 2011 | VOL 473 | NATURE | 295 




CARDIOVASCULAR BIOLOGY 


Nature INSIGHT 


Cover illustration by 
Nik Spencer 


Editor, Nature 
Philip Campbell 
Publishing 

Nick Campbell 
Insights Editor 
Ursula Weiss 
Production Editor 
Nicola Bailey 
Senior Art Editor 
Kelly Buckheit Krause 
Art Editor 

Nik Spencer 
Sponsorship 
Gerard Preston 
Production 

Emilia Orviss 
Marketing 

Elena Woodstock, 
Hannah Phipps 
Editorial Assistant 
Hazel Mayhew 


The Macmillan Building 

4 Crinan Street 

London N1 9XW, UK 

Tel: +44 (0) 20 7833 4000 
e: nature@nature.com 




nature publishing group 


19 May 2011 / Vol 473 / Issue No 7347 


A healthy vasculature is crucial to our survival:
vessels act as conduits to deliver oxygen and
nutrients to every tissue in our body, remove waste
and allow immune surveillance. Vascular endothelial 
cells also contribute to processes such as haematopoiesis 
and organ development during embryogenesis. Not 
surprisingly, vascular dysfunction is linked to diverse 
disorders, from cancer to eye diseases, and can also trigger 
heart failure and death. Understanding how vessels grow 
and function has huge potential for improving human 
health, and this exciting topic is developing at a rapid pace. 

Anti-angiogenic agents that block vascular endothelial 
growth factor (VEGF) are being used in the clinic to treat 
patients with cancer and eye diseases, but efficacy has 
been limited. Peter Carmeliet and Rakesh Jain review the 
frontline research into the mechanisms regulating normal 
and pathophysiological angiogenic growth: they discuss 
the clinical experience with VEGF blockers and describe 
recently identified signalling pathways that contribute 
to vessel growth, maturation and quiescence, which may 
provide new avenues to improve anti-angiogenic therapy. 

Mark Lindsay and Harry Dietz explore the pathogenesis 
of aortic aneurysm, an enlargement and weakening of the
arteries, which can lead to fatal tearing. They show how
research has implicated the cytokine transforming growth
factor-β in the pathogenesis of this disorder, and how this
has led to new therapeutic opportunities with losartan, an 
angiotensin II receptor antagonist. 

Atherosclerosis is a disease of the arterial wall, in which 
lipid-filled plaques develop in the inner lining of the 
artery. Preclinical research has led to many hypotheses 
about the pathogenesis of plaque formation. Peter Libby, 
Paul Ridker and Göran Hansson discuss the limitations of
animal models of atherosclerosis and the difficulties with 
extrapolating these findings to human disease, and suggest 
better ways to consolidate preclinical and clinical research. 

A potentially fatal consequence of ruptured 
atherosclerotic plaques is myocardial infarction, which 
causes a massive loss of cardiomyocytes, leading to heart 
dysfunction or complete heart failure. In the last review of 
this series, Michael Laflamme and Charles Murry discuss 
the emerging prospects for regenerating the damaged 
myocardium, a dynamic area of research that draws from 
advances in stem-cell biology, developmental biology and 
tissue engineering. 

Clare Thomas 
Senior Editor

Molecular mechanisms and clinical
applications of angiogenesis


Peter Carmeliet & Rakesh K. Jain


Blood vessels deliver oxygen and nutrients to every part of the body, but also nourish diseases such as cancer. Over the 
past decade, our understanding of the molecular mechanisms of angiogenesis (blood vessel growth) has increased at an 
explosive rate and has led to the approval of anti-angiogenic drugs for cancer and eye diseases. So far, hundreds of thousands
of patients have benefited from blockers of the angiogenic protein vascular endothelial growth factor, but limited
efficacy and resistance remain outstanding problems. Recent preclinical and clinical studies have shown new molecular 
targets and principles, which may provide avenues for improving the therapeutic benefit from anti-angiogenic strategies. 


Blood vessels arose in evolution to allow haematopoietic cells to
patrol the organism for immune surveillance, to supply oxygen
and nutrients and to dispose of waste. Vessels also produce
instructive signals for organogenesis in a perfusion-independent manner
(Box 1). Although beneficial for tissue growth and regeneration,
vessels can fuel inflammatory and malignant diseases, and are exploited
by tumour cells to metastasize and kill patients with cancer. Because
vessels nourish nearly every organ of the body, deviations from normal
vessel growth contribute to numerous diseases. To name just a few,
insufficient vessel growth or maintenance can lead to stroke, myocardial
infarction, ulcerative disorders and neurodegeneration, and
abnormal vessel growth or remodelling fuels cancer, inflammatory
disorders, pulmonary hypertension and blinding eye diseases.

Several modes of vessel formation have been identified (Fig. 1). In the
developing mammalian embryo, angioblasts differentiate into endothelial
cells, which assemble into a vascular labyrinth, a process known
as vasculogenesis (Fig. 1b). Distinct signals specify arterial or venous
differentiation. Subsequent sprouting ensures expansion of the vascular
network, known as angiogenesis (Fig. 1a). Arteriogenesis then occurs,
in which endothelial cell channels become covered by pericytes or vascular
smooth muscle cells (VSMCs), which provide stability and control
perfusion. Tissues can also become vascularized by other mechanisms,
but the relevance of these processes is not well understood. For example,
pre-existing vessels can split by a process known as intussusception,
giving rise to daughter vessels (Fig. 1c). In other cases, vessel co-option
occurs, in which tumour cells hijack the existing vasculature (Fig. 1d),
or tumour cells can line vessels, a phenomenon known as vascular
mimicry (Fig. 1e). Putative cancer stem-like cells can even generate
tumour endothelium (Fig. 1f). Although debated, the repair of healthy
adult vessels or the expansion of pathological vessels can be aided by the
recruitment of bone-marrow-derived cells (BMDCs) and/or endothelial
progenitor cells to the vascular wall. The progenitor cells then become
incorporated into the endothelial lining in a process known as postnatal
vasculogenesis. Collateral vessels, which bring bulk flow to ischaemic
tissues during revascularization, enlarge in size by distinct mechanisms,
such as the attraction and activation of myeloid cells.

The revascularization of ischaemic tissues would benefit millions, but
therapeutic angiogenesis remains an unmet medical need. Instead, more
success has been achieved by targeting the vascular supply in cancer
and eye diseases. In this Review, we describe key molecular targets in
angiogenesis and discuss the clinical experience with the most widely 
used class of anti-angiogenic agent — blockers of vascular endothelial 
growth factor (VEGF, also known as vascular permeability factor or 
VPF). Rather than providing an encyclopaedic survey, we focus on some 
of the recently discovered mechanisms and principles, and on targets 
with translational potential. 


Vessel branching, maturation and quiescence 
We first provide the current view of the sequential steps of vessel 
branching (quiescence, activation and resolution), before discussing 
the molecular players involved in more depth (Fig. 2). In a healthy 
adult, quiescent endothelial cells have long half-lives and are protected 
against insults by the autocrine action of maintenance signals such as 
VEGF, NOTCH, angiopoietin-1 (ANG-1) and fibroblast growth factors
(FGFs). Because vessels supply oxygen, endothelial cells are equipped 
with oxygen sensors and hypoxia-inducible factors, such as prolyl
hydroxylase domain 2 (PHD2) and hypoxia-inducible factor-2α
(HIF-2α), respectively, which allow the vessels to re-adjust their shape
to optimize blood flow. Quiescent endothelial cells form a monolayer of 
phalanx cells with a streamlined surface, interconnected by junctional 
molecules such as VE-cadherin and claudins. These endothelial cells are 
ensheathed by pericytes, which suppress endothelial cell proliferation 
and release cell-survival signals such as VEGF and ANG-1. Endothelial
cells and pericytes at rest produce a common basement membrane. 
When a quiescent vessel senses an angiogenic signal, such as VEGF, 
VEGF-C, ANG-2, FGFs or chemokines, released by a hypoxic, inflam- 
matory or tumour cell, pericytes first detach from the vessel wall (in 
response to ANG-2) and liberate themselves from the basement mem- 
brane by proteolytic degradation, which is mediated by matrix metal- 
loproteinases (MMPs) (Fig. 2a). Endothelial cells loosen their junctions, 
and the nascent vessel dilates. VEGF increases the permeability of the 
endothelial cell layer, causing plasma proteins to extravasate and lay 
down a provisional extracellular matrix (ECM) scaffold. In response 
to integrin signalling, endothelial cells migrate onto this ECM surface. 
Proteases liberate angiogenic molecules stored in the ECM such as VEGF 
and FGF, and remodel the ECM into an angio-competent milieu. To
build a perfused tube and prevent endothelial cells from moving en 
masse towards the angiogenic signal, one endothelial cell, known as 
the tip cell, becomes selected to lead the tip in the presence of factors 
such as VEGF receptors, neuropilins (NRPs) and the NOTCH ligands
DLL4 and JAGGED1 (Fig. 2a). The neighbours of the tip cell assume
subsidiary positions as stalk cells, which divide to elongate the stalk
(stimulated by NOTCH, NOTCH-regulated ankyrin repeat protein
(NRARP), WNTs, placental growth factor (PlGF) and FGFs) and establish
the lumen (mediated by VE-cadherin, CD34, sialomucins, VEGF
and hedgehog) (Fig. 2b). Tip cells are equipped with filopodia to sense
environmental guidance cues such as ephrins and semaphorins, whereas
stalk cells release molecules such as EGFL7 into the ECM to convey
spatial information about the position of their neighbours, so that the
stalk elongates. A hypoxia-inducible program, driven by HIF-1α, renders
endothelial cells responsive to angiogenic signals. Myeloid bridge
cells aid fusion with another vessel branch, allowing the initiation of
blood flow. For a vessel to become functional, it must become mature and
stable. Endothelial cells resume their quiescent phalanx state (Fig. 2c),
and signals such as platelet-derived growth factor B (PDGF-B), ANG-1,
transforming growth factor-β (TGF-β), ephrin-B2 and NOTCH cause
the cells to become covered by pericytes. Protease inhibitors known as
tissue inhibitors of metalloproteinases (TIMPs) and plasminogen activator
inhibitor-1 (PAI-1) cause the deposition of a basement membrane,
and junctions are re-established to ensure optimal flow distribution.
Vessels regress if they are unable to become perfused.


BOX 1

Perfusion-independent role of endothelial cells

During embryogenesis, the invasion of endothelial cells into nascent
organs confers inductive signals to promote organogenesis, even in
the absence of blood flow. This suggests that endothelial cells not
only form passive conduits for delivering oxygen but also establish
organ-specific vascular niches, which stimulate organogenesis by
the production of paracrine-tropic ‘angiocrine’ factors. Endothelial
cells show remarkable heterogeneity in different organs. These
organ-specific endothelial cells release signals for pancreatic
differentiation, reconstitution of haematopoietic stem cells and
expansion of neuronal precursors, and give rise to haematopoietic
progenitors by endothelial-to-haematopoietic transition. The
vascular adventitia (the outer layer of vessels) also hosts vessel-resident
stem and progenitor cells. Emerging evidence indicates
that such perfusion-independent activities of endothelial cells also
promote tumorigenesis. In addition to constituting the building
blocks of vessels and delivering nutrients and oxygen, tumour
endothelial cells allow the recruitment of pro-angiogenic
bone-marrow-derived cells.


The VEGF family 

Given the complexity of a process such as angiogenesis, it is remarkable
that a single growth factor, VEGF, regulates this process so
predominantly. The VEGF family consists of only a few members and
distinguishes itself from other angiogenic superfamilies by the largely
non-redundant roles of its members. VEGF (also known as VEGF-A) is
the main component, and it stimulates angiogenesis in health and disease
by signalling through VEGF receptor-2 (VEGFR-2, also known as
FLK1). Neuropilins such as NRP1 and NRP2 are VEGF co-receptors,
which enhance the activity of VEGFR-2, but also signal independently.
Similar to VEGFR-2 deficiency, the loss of VEGF aborts vascular
development. In response to a VEGF gradient, established by soluble and
matrix-bound isoforms, tip cells upregulate DLL4 expression, which
activates NOTCH in stalk cells; this downregulates VEGFR-2 expression,
rendering stalk cells less responsive to VEGF, thereby ensuring
that the tip cell takes the lead. Soluble VEGF isoforms promote vessel
enlargement, whereas matrix-bound isoforms stimulate branching.
Paracrine VEGF, released by tumour, myeloid or other stromal cells,
increases vessel branching and renders tumour vessels abnormal,
whereas autocrine VEGF, released by endothelial cells, maintains
vascular homeostasis. Emerging evidence indicates that the biological effect
of VEGFR-2 signalling depends on its subcellular localization; for
example, for VEGF to induce arterial morphogenesis, VEGFR-2 must
signal from intracellular compartments. Activating VEGFR2 mutations
cause vascular tumours, and genetic polymorphisms in VEGF and/or
its receptors co-determine pathological angiogenesis, whereas the
blockade of VEGF signalling can target angiogenic vessels in malignant
and ocular disease in humans. VEGF protein or gene transfer stimulates
vessel growth in ischaemic tissues, but often in association with
undesired leakage and vessel abnormalities.

VEGF-C, a ligand of the VEGFR-2 and VEGFR-3 receptors, activates
blood-vessel tip cells. VEGFR-3 is necessary for the formation of the
blood vasculature during early embryogenesis, but later becomes a key
regulator of lymphangiogenesis, the formation of new lymphatic
vessels from pre-existing ones. In zebrafish, in which the first
embryonic vein arises by segregation of venous-fated endothelial cells from a
common precursor vessel, the sprouting of venous endothelial cells is
restricted by VEGFR-2 but promoted by VEGFR-3 (ref. 18).
Venous-derived angiogenesis in the arterial trunk also relies on VEGFR-3
signalling. Anti-VEGFR-3 antibodies that inhibit receptor dimerization
or ligand binding slow down tumour growth synergistically, and
enhance the inhibition of tumour growth by VEGFR-2 blockade, making
VEGFR-3 another anti-angiogenic candidate.

Originally discovered as a VEGF homologue, PlGF was also expected
to be an angiogenic factor. However, unlike VEGF, PlGF is dispensable
for development and is relevant only in disease. PlGF is a multitasking
cytokine that stimulates angiogenesis by direct or indirect mechanisms,
and also activates bone-marrow-derived endothelial progenitor and
myeloid cells, as well as stromal cells, to create a nurturing ‘soil’ for tumour
cells, in addition to activating tumour cells. By skewing the polarization
of tumour-associated macrophages (TAMs), the loss of PlGF improves
vessel perfusion and maturation, and enhances responses to
chemotherapy. PlGF blockade by neutralizing anti-PlGF antibodies phenocopies
the anti-angiogenic effects of genetic Plgf (also known as Pgf) deficiency
in spontaneous mouse tumour models and diseases such as ocular
neovascularization. Yet other PlGF-blocking strategies fail to inhibit the
growth of tumours in transplantable tumour models. The therapeutic
potential of PlGF blockade in patients with cancer thus remains to be
established. In preclinical models, PlGF protein or gene delivery increases
the revascularization of ischaemic tissues.

Deficiency of the VEGF family member VEGF-B in mice does not
impair angiogenesis in normal development, and VEGF-B cannot compensate
for VEGF blockade after birth. VEGF-B has only restricted angiogenic
activity in certain tissues such as the heart, yet it promotes neuronal
survival and induces metabolic effects. Divergent effects of VEGF-B on
pathological angiogenesis have been reported, and it has been shown to
promote the growth of cardiac vessels, without inducing adverse effects
such as increased permeability or leakage.

The precise role of the VEGFR-1 receptor (also known as FLT-1)
in angiogenesis remains elusive. VEGFR-1 exists both as a
membrane-anchored signalling-competent form and as a soluble secreted
form (also known as sFLT-1). By trapping its ligands, sFLT-1 can assist
the guidance of the emerging branch or inhibit sprouting altogether.
Because of its weak tyrosine kinase activity, VEGFR-1 may act as a
decoy for VEGF, moderating the amount of free VEGF available to
activate VEGFR-2 and explaining why VEGFR-1 loss results in vessel
overgrowth. However, intracellular VEGFR-1 signalling in angiogenic
endothelial, stromal and myeloid cells stimulates pathological
angiogenesis. VEGFR-1 signalling also promotes the growth of
VEGFR-1⁺ tumour cells in response to autocrine VEGF production
in an angiogenesis-independent manner, and upregulates MMP9 in
endothelial cells at the premetastatic site. There is evidence to suggest
that VEGFR-1⁺ haematopoietic progenitors form a premetastatic niche
in distant organs, but this finding is debated. Neutralizing anti-PlGF,
anti-VEGFR-1 and anti-VEGFR-2 antibodies are in early clinical
development.


Figure 1 | Modes of vessel formation. There are several known methods of
blood vessel formation in normal tissues and tumours. a–c, Vessel formation
can occur by sprouting angiogenesis (a), by the recruitment of
bone-marrow-derived and/or vascular-wall-resident endothelial progenitor cells (EPCs)
that differentiate into endothelial cells (ECs; b), or by a process of vessel
splitting known as intussusception (c). d–f, Tumour cells can co-opt
pre-existing vessels (d), or tumour vessels can be lined by tumour cells (vascular
mimicry; e) or by endothelial cells, with cytogenetic abnormalities in their
chromosomes, derived from putative cancer stem cells (f). Unlike normal
tissues, which use sprouting angiogenesis, vasculogenesis and intussusception
(a–c), tumours can use all six modes of vessel formation (a–f).


The PDGF family 
For vessels to function properly, they must be mature and covered by
mural cells. Several growth-factor families, such as PDGFs,
angiopoietins and TGF-β, contribute to this process. To stabilize endothelial cell
channels, angiogenic endothelial cells release PDGF-B to chemoattract
PDGF receptor-β (PDGFR-β)⁺ pericytes. Hence, pericyte deficiency
after PDGF-B ablation causes vessel leakage, tortuosity, microaneurysm
formation and bleeding. Knockout of the genes encoding the PDGF-B
protein retention motif (necessary for pericyte adhesion) in mice results in
tumour vessel fragility and hyperdilation, whereas PDGFR-β-hypomorph
mice have insufficient pericytes around brain vessels, leading to
blood–brain barrier (BBB) defects and neurodegenerative damage owing to the
leakage of toxic substances. Tumour-derived PDGF-B also recruits
pericytes indirectly by upregulating stromal-cell-derived factor-1α (SDF-1α;
encoded by CXCL12). Besides a local origin, pericytes can also arise
from perivascular PDGFR-β⁺ pericyte progenitors, recruited from the
bone marrow. By inhibiting PDGFR-β signalling in mural cells, VEGF
reduces pericyte coverage and renders tumour vessels abnormal.
PDGFR inhibition diminishes tumour growth by causing pericyte
detachment, leading to immature vessels that are prone to regression.
Other pericyte-deficient mouse strains that lack the proteoglycan NG2
(also known as CSPG4) also form abnormal tumour vessels and smaller
tumours. Paradoxically, the overexpression of PDGF-B in mice
inhibits tumour growth by promoting pericyte recruitment and inducing
endothelial cell growth arrest. Because the survival of endothelial cells
depends on pericyte VEGF production, pericytes protect endothelial
cells from VEGF withdrawal and confer resistance to VEGF blockade.
This protection requires a close endothelial-cell–pericyte interaction,
as PDGF-B blockade reduces pericyte coverage and vessel number
only when VEGF is produced by pericytes and not by more distant
tumour cells. Initial studies using multi-target receptor tyrosine kinase
inhibitors (TKIs) showed that blocking PDGF-B renders mature vessels
more sensitive to VEGF blockade by depleting the vessels of pericytes.
Recent studies with more specific inhibitors have shown that
combination therapy is no more efficient than anti-VEGF monotherapy.
PDGFR-β⁺ pericytes have a dual role in metastasis. In primary
tumours, pericytes limit tumour cell intravasation, because the more
loosely assembled vessel wall is no longer a barrier for disseminating
tumour cells after depletion of pericytes. The absence of pericytes
around vessels also correlates with metastasis in patients, and a trial
evaluating PDGF-B blockade was aborted because of excessive
leakage. These studies indicate that blocking vessel maturation can promote
malignancy. However, other reports have shown that pericytes, co-opted
by tumour cells at micrometastatic sites, allow tumour colonization by
releasing angiogenic factors. Overall, future studies are needed to explore
the benefits and risks of PDGF blockade for the treatment of cancer.
PDGF-B blockade may be used therapeutically for non-malignant
vascular diseases such as pulmonary hypertension, whereas PDGF-B
activation may offer therapeutic opportunities for stabilizing vascular
malformations. PDGF-CC, another family member released by
cancer-associated fibroblasts in VEGF-inhibitor-resistant tumours, stimulates
vessel growth and maturation, and attenuates the response to anti-VEGF
treatment. By preventing the activation of perivascular PDGFR-α⁺
astrocytes, which together with pericytes constitute the BBB, the blockade
of PDGF-CC preserves the integrity of the BBB during stroke. Inhibition
of PDGF-DD suppresses ocular neovascularization, whereas PDGF-DD
overexpression normalizes tumour vessels and improves drug delivery.


TGF-β signalling

Human hereditary haemorrhagic teleangiectasia is characterized by
vascular malformations. Human genetic studies have shown that this
disorder is due to mutations in the genes that encode endoglin (ENG) or
activin receptor-like kinase (ALK1, also known as ACVRL1), which are
receptors of the TGF-β family. Mouse studies have confirmed that the loss of the
TGF-β receptors ALK-1, TGFR-1 (also known as ALK-5), TGFR-2 or
ENG results in arteriovenous malformations, reminiscent of those seen
in patients with hereditary haemorrhagic teleangiectasia. However,
understanding the molecular basis of this pathway has been challenging
owing to inconsistent results. This is partly due to the context-dependent
pro- and anti-angiogenic effects of TGF-β family members. Furthermore,
although TGF-β promotes VSMC differentiation, and deficiency of ENG
or ALK-1 impairs mural cell development, it remains unclear whether
other TGF-β components mediate their vascular effects in vivo by means
of endothelial cells or VSMCs. Preclinical studies have shown that
antibodies against ENG or ALK-1 can inhibit tumour angiogenesis and
growth. Several TGF-β blockers are now in early-phase clinical trials.


The FGF superfamily 

The superfamily of FGFs and their receptors controls a wide range of
biological functions. bFGF was among the first discovered angiogenic
factors and, like FGF1, has angiogenic and arteriogenic properties; FGF9
stimulates angiogenesis in bone repair. FGFs activate receptors (FGFRs)
on endothelial cells or indirectly stimulate angiogenesis by inducing
the release of angiogenic factors from other cell types. For instance, in
the heart, FGF-mediated signalling fuels vessel growth by stimulating
the release of hedgehog, ANG-2 and VEGF-B. Low levels of FGF are
required for the maintenance of vascular integrity, as inhibition of FGFR
signalling in quiescent endothelial cells causes vessel disintegration.
Aberrant FGF signalling promotes tumour angiogenesis and mediates
the escape of tumour vascularization from VEGF- or epidermal growth
factor receptor (EGFR)-inhibitor treatment. Development of specific
FGF or FGFR inhibitors for blocking angiogenesis is lagging behind,
partly because Fgf1 or Fgf2 deficiency in mice did not produce vascular
defects and the FGF superfamily shows substantial redundancy. FGF
protein or gene transfer has been tested for therapeutic angiogenesis,
but without sustained success in the clinic.


Figure 2 | Molecular basis of vessel branching. The consecutive steps of blood
vessel branching are shown, with the key molecular players involved denoted in
parentheses. a, After stimulation with angiogenic factors, the quiescent vessel
dilates and an endothelial tip cell is selected (DLL4 and JAGGED1) to ensure
branch formation. Tip-cell formation requires degradation of the basement
membrane, pericyte detachment and loosening of endothelial cell junctions.
Increased permeability permits extravasation of plasma proteins (such as
fibrinogen and fibronectin) to deposit a provisional matrix layer, and proteases
remodel pre-existing interstitial matrix, all enabling cell migration. For
simplicity, only the basement membrane between endothelial cells and pericytes
is depicted, but in reality, both pericytes and endothelial cells are embedded in
this basement membrane. b, Tip cells navigate in response to guidance signals
(such as semaphorins and ephrins) and adhere to the extracellular matrix
(mediated by integrins) to migrate. Stalk cells behind the tip cell proliferate,
elongate and form a lumen, and sprouts fuse to establish a perfused neovessel.
Proliferating stalk cells attract pericytes and deposit basement membranes
to become stabilized. Recruited myeloid cells such as tumour-associated
macrophages (TAMs) and TIE-2-expressing monocytes (TEMs) can produce
pro-angiogenic factors or proteolytically liberate angiogenic growth factors
from the ECM. c, After fusion of neighbouring branches, lumen formation
allows perfusion of the neovessel, which resumes quiescence by promoting
a phalanx phenotype, re-establishment of junctions, deposition of basement
membrane, maturation of pericytes and production of vascular maintenance
signals. Other factors promote transendothelial lipid transport.


The ANG and TIE signalling system 

Healthy vessels must be equipped with mechanisms to maintain
quiescence, while remaining able to respond to angiogenic stimuli. The
ANG and TIE family is a binary system that offers such a switch. The
human ANG and TIE family consists of two receptors, TIE-1 and TIE-2, and
three ligands, ANG-1, ANG-2 and ANG-4. ANG-1 functions as a TIE-2
agonist, and ANG-2 functions as a competitive ANG-1 antagonist in a
context-dependent manner (ANG-4 has not been as well studied, but is
thought to act like ANG-1). Because no ligand for TIE-1 has been
identified, this orphan receptor may act as a negative regulator of TIE-2, but its
precise role remains elusive. ANG-1 is expressed by mural and tumour
cells, whereas ANG-2 is released from angiogenic tip cells. In confluent
endothelium, ANG-1 induces TIE-2 clustering in trans at cell–cell
junctions to maintain endothelial cell quiescence. ANG-1 also stimulates
mural coverage and basement membrane deposition, thereby promoting
vessel tightness. In the presence of angiogenic stimulators, sprouting
endothelial cells release ANG-2, which antagonizes ANG-1 and TIE-2
signalling to enhance mural cell detachment, vascular permeability and
endothelial cell sprouting. In accordance, Tie2 (also known as Tek)
deficiency in mice causes vascular defects, and activating germline and
somatic TIE2 (TEK) mutations in humans result in venous
malformations. Tumour-derived ANG-2 also promotes angiogenesis by recruiting
pro-angiogenic TIE-2-expressing monocytes (TEMs).

The overall effects of the ANG–TIE system on tumours are context
dependent. ANG-1 stimulates tumour growth by promoting
endothelial cell survival and vessel maturation, but it also inhibits tumour
cell extravasation and maintains the integrity of healthy vessels
outside tumours. These conflicting biological activities warrant caution
when considering ANG-1 as an anticancer target. Instead, ANG-2 may
be a more appealing therapeutic target because it stimulates tumour
angiogenesis and recruits pro-angiogenic TEMs, and ANG-2 inhibition
promotes vessel regression and normalization. Given that ANG-2 and
VEGF cooperatively increase angiogenesis, co-blockade of VEGF and
ANGs is superior in inhibiting tumour angiogenesis, metastasis and
leakage. Various agents that block either TIE-2 or ANG-2 are being
evaluated in early-phase clinical trials.


The NOTCH and WNT signalling pathway 
The vessel-branching model postulates that, in general, tip cells migrate
and stalk cells proliferate. Recent studies have implicated NOTCH
signalling in this model. In response to VEGF, activation of VEGFR-2
upregulates DLL4 expression in tip cells. In neighbouring stalk cells,
DLL4 then activates NOTCH, which downregulates VEGFR-2 but
upregulates VEGFR-1; thus, the stalk cells become less responsive to the
sprouting activity of VEGF but more sensitive to molecules such as PlGF.
Overall, DLL4 and NOTCH signalling restricts branching but generates
perfused vessels. By upregulating PDGFR-β in NOTCH⁺ mural cells,
DLL4 in endothelial cells also stimulates vessel maturation. JAGGED1,
another NOTCH ligand expressed by stalk cells, promotes tip-cell
selection by interfering with the reciprocal DLL4 and NOTCH signalling
from the stalk cell to the tip cell. NOTCH signalling in stalk cells is
dynamic over time, because it upregulates its own inhibitor, NRARP.
An unanticipated complexity is that endothelial cells continuously
compete for the tip-cell position by fine-tuning their expression of
VEGFR-2 versus VEGFR-1, indicating that this signalling circuit is
constantly re-evaluated as cells meet new neighbours. In accordance, the
inhibition of DLL4 and NOTCH signalling induces the formation of more
numerous but hypoperfused vessels, resulting in tumour hypoxia and
growth inhibition. However, chronic DLL4 blockade in healthy animals
results in vascular neoplasms, and endothelial cell inactivation of RBP-J,
a transcription factor downstream of NOTCH, also leads to uncontrolled
angiogenesis. Although these data indicate that quiescent phalanx cells
need low-level NOTCH signalling, they also warrant caution against the
indiscriminate use of DLL4 and NOTCH inhibitors for the treatment of
cancer. Signalling by the hedgehog family members also participates in
embryonic vasculogenesis, vascular morphogenesis and tube formation,
as well as in arterial specification, by regulating NOTCH expression.
Endothelial cells express various types of WNT ligand and their
frizzled (FZD) receptors, of which several stimulate endothelial cell
proliferation. NOTCH activates WNT signalling in proliferating stalk cells during
vessel branching, explaining why NOTCH, which usually suppresses
proliferation and promotes quiescence, stimulates proliferation of stalk
cells in vivo. WNT also activates NOTCH in a reciprocal-feedback
system, because WNT signals in endothelial cells induce a NOTCH-like
phenotype, characterized by branching defects, loss of venous identity
and aberrant vascular remodelling. Gene-inactivation of some of the
WNT and FZD members in mice (Wnt2, Wnt5a, Fzd4 and Fzd5) causes
vascular defects, whereas the combined loss of Wnt7a and Wnt7b impairs
brain angiogenesis and BBB formation. Because some WNT members
inhibit angiogenesis, specific blockers of these proteins will be required.


Guidance signals in angiogenesis

Tip cells sense guidance cues, similar to how axonal growth cones
explore their surroundings. It is therefore not surprising that molecules
used by navigating axons are evolutionarily conserved, and molecules
such as VEGF also guide neuronal cells. Navigating endothelial cells
express receptors for axon-guidance cues, including ephrin receptors
(EPH); neuropilins (NRPs) and PLEXIN-D1, which bind semaphorins;
ROBO4, which binds slit proteins; and UNC5B, which binds netrin
proteins. Given the size and complexity of these families (and the
existing controversies), we illustrate this concept with a few examples
that have therapeutic potential.

EPH receptors and their ligands, the ephrins, regulate
cell-contact-dependent patterning and can generate bidirectional signals. The
signalling cascade in ephrin-expressing cells is known as reverse
signalling, whereas signalling in EPH-receptor-expressing cells is termed
forward signalling. Ephrin-B2 and its receptor EPHB4 regulate vessel
morphogenesis by several mechanisms. During vasculogenesis, the
vascular plexus is marked by ephrin-B2⁺ arterial and EPHB4⁺ venous
territories. By avoiding repulsive actions, ephrin-B2⁺ and EPHB4⁺ cells
prevent intermingling and segregate from each other. In zebrafish,
the emigration of venous-fated cells from a precursor vessel leads
to segregation of ephrin-B2⁺ arterial endothelial cells in the dorsal
aorta and EPHB4⁺ venous endothelial cells in the cardinal vein.
Moreover, reverse signalling by ephrin-B2 in tip cells induces VEGFR-2
internalization, which is necessary for downstream signalling of
VEGFR-2 to elicit VEGF-induced tip-cell filopodial extension. Ephrin-B2
also promotes the recruitment of mural cells and bone-marrow-derived
endothelial progenitor cells. In tumours, the overall effect of EPHB4 is
pro-angiogenic, making it a target for anti-angiogenic therapy. Indeed,
upregulation of EPHB4 stimulates tumour angiogenesis, whereas
EPHB4 blockade has the opposite effect. Other EPH receptors and
ephrin ligands, such as EPHA2 and ephrin-A1, have a role in vessel
growth and maturation. Notably, ephrin-A1 levels are upregulated
in tumours treated with VEGF blockers, suggesting that it contributes
to resistance against VEGF blockade. Various therapeutics that
target EPH receptors and ephrin ligands are being developed, but the
complexity of this signalling system should be kept in mind.

Semaphorins are secreted or membrane-anchored, and bind to
plexin proteins or their NRP co-receptors. The loss of Plxnd1 in mice
induces erroneous navigation of vessels, because endothelial cells
cannot recognize the repulsive semaphorin-3E (SEMA3E) signals in
their environment. Many semaphorins inhibit tumour angiogenesis,
including SEMA3A, SEMA3B, SEMA3D, SEMA3F and SEMA4A, whereas
SEMA3C and SEMA4D promote tumour angiogenesis. NRPs bind
ligands such as semaphorins and VEGF, but the vascular defects
observed in Nrp1-deficient embryos are attributable to defective VEGF
signalling, rather than defective semaphorin signalling. An antibody
that blocks the binding of VEGF, but not of SEMA3A, to NRPs also
inhibits tumour angiogenesis. Dual targeting with antibodies that block
both VEGF and NRP1 is more effective than single-agent therapy,
presumably because the antivascular remodelling effects of
anti-NRP1 antibodies keep vessels in a VEGF-dependent state. In addition,
a soluble NRP2B variant with increased VEGF affinity enhances the
tumour-growth-inhibitory activity of an antibody that blocks the
interactions of VEGF with VEGFR-2 but not with NRPs.


Integrins and proteases 

The ECM provides a physical link between vascular cells and their
surrounding tissues. Endothelial cells possess mechanisms to interact with
and alter the matrix. Integrins are heterodimeric receptors that
mediate adhesion to ECM and immunoglobulin superfamily molecules.
Upregulation of the integrins αvβ3 and αvβ5 permits growing endothelial
cells to bind to provisional matrix proteins in the tumour milieu; these
proteins include vitronectin, fibrinogen and fibronectin, both in native
and degraded forms. These adhesive interactions provide survival cues
and traction for invading endothelial cells. Other integrins involved in
angiogenesis include α1β1, α2β1, α4β1, α5β1, α6β1, α9β1 and α6β4 (refs 59, 60).
In addition to signalling induced by ligating ECM components,
integrins regulate angiogenesis by other mechanisms. Given their ability to
interact with several extracellular molecules and transmit signals in a
bidirectional manner, integrins function as ‘hubs’, orchestrating endothelial
cell and VSMC behaviour during angiogenesis. Hence, the binding
of integrins to growth factors (such as VEGF, FGFs and ANG-1) or their
receptors (VEGFR-2 and FGFRs) stimulates vessel growth. Integrins also
upregulate and activate zymogen proteases in invading tip cells, and
promote vessel maturation by regulating interactions between endothelial
cells, pericytes and the basement membrane. Other integrins promote
the adhesion of angiogenic BMDCs to tumour endothelial cells. Recent
studies have highlighted the complexities in understanding the role of αvβ3
in pathological angiogenesis, as tumour angiogenesis was stimulated by
gene deficiency in mice but inhibited by pharmacological blockade.
Nonetheless, integrin blockers are now being evaluated in the clinic.
Quiescent endothelial cells and pericytes share a common
basement membrane, which not only physically restrains these cells but also
keeps them quiescent owing to the antiproliferative properties of the
ECM components. During branching, proteolytic remodelling of the
ECM liberates these cells for unrestricted movement and converts the
characteristics of the basement membrane into a pro-angiogenic
environment. Distinct proteases such as MMPs modulate angiogenesis by
several mechanisms. They promote endothelial cell migration and tube
formation by proteolytically remodelling the basement membrane, by
executing directional matrix proteolysis (membrane type 1-MMP) or
by exposing chemotactic cryptic motif sites in the ECM. MMPs and
plasmin also liberate angiogenic factors such as VEGF and FGF from
immobilized matrix stores. VEGF isoforms that are cleaved by MMPs
(and therefore soluble) preferentially enlarge vessels, whereas
MMP-resistant matrix-bound VEGF supports vessel branching. Macrophages,
neutrophils and mast cells initiate angiogenesis by MMP9-mediated
activation of VEGF. Proteases such as MMP9 also participate in the
mobilization of progenitors from the bone marrow by shedding soluble
forms of membrane-bound cytokines (such as KIT ligand; also known
as stem-cell factor or SCF). MMPs establish a premetastatic niche by
allowing the recruitment of marrow progenitors. Given their
destructive potential, the activity of proteases must be tightly controlled. For
instance, loss of the inhibitor PAI-1 prevents vessel branching because
excessive ECM breakdown leaves no matrix support for the sprout.
In addition, basement membrane deposition during vessel
maturation requires the activity of MMP inhibitors such as TIMPs. Because
degradation of ECM components can also generate anti-angiogenic
fragments such as tumstatin and angiostatin, protease inhibitors must be
judiciously evaluated for biological effects.


Hypoxia and epigenetic regulation of angiogenesis

The prolyl hydroxylase domain (PHD) proteins PHD1–3 are
oxygen-sensing enzymes that hydroxylate the hypoxia-inducible factor (HIF)
proteins HIF-1α and HIF-2α when sufficient oxygen is available. Once
hydroxylated, HIFs are targeted for proteasomal degradation. Under
hypoxia, PHDs become inactive, and HIFs initiate broad transcriptional
responses to increase the oxygen supply by angiogenesis, through
the upregulation of angiogenic factors such as VEGF. HIFs are
also activated in non-hypoxic conditions by oncogenes and growth
factors, allowing tumour cells to stimulate angiogenesis before they
become deprived of oxygen. In general, HIF-1α promotes vessel
sprouting, whereas HIF-2α mediates vascular maintenance. Reduced
HIF-1α levels in mice impair embryonic vascular development,
revascularization of ischaemic tissues, and angiogenesis in injured
tissues and tumours. The use of HIF-1α inhibitors to block tumour
or ocular angiogenesis has therefore received attention. Conversely,
Hif1a gene transfer in mice or activation of HIF-1α by pharmacological
blockade of PHDs promotes ischaemic tissue revascularization.
HIF-1α also regulates tumour angiogenesis indirectly, by releasing
chemoattractants such as SDF-1α to recruit pro-angiogenic BMDCs.
Gene silencing of Phd2 in mouse tumour cells enhances vessel growth
by similar mechanisms. Hypoxia also regulates the polarization and
pro-angiogenic activity of tumour-associated macrophages (TAMs) by
means of HIF-1α and HIF-2α with different effects. That hypoxia and
inflammation are closely intertwined is illustrated by the finding that
signalling by HIF-1α and nuclear factor-κB cross-activate each other.
In certain cases, hypoxic upregulation of VEGF occurs independently
of HIF-1α, and is mediated by the metabolic regulator PGC-1α in
preparation for oxidative metabolism once the ischaemic tissue is
revascularized. Because HIF signalling contributes to acquired
resistance against anti-VEGF therapy, the combined blockade of VEGF
and HIF-1α is being explored as a cancer treatment strategy.

There is increasing evidence for epigenetic control of angiogenesis,
particularly by non-coding microRNAs (miRNAs), which induce
messenger RNA degradation or block translation. Because miRNAs
target multiple genes, they are well positioned to regulate complex
processes such as angiogenesis. Endothelial cells express several
miRNAs that are induced by hypoxia or VEGF. Most of those stimulate
angiogenesis by hijacking pro-angiogenic cascades, while suppressing
angiostatic pathways. The expression of miR-126 is induced by
the mechanosensitive transcription factor KLF2A and integrates the
mechanosensory stimulus of blood flow to shape the vascular system.
Endothelial-cell-specific loss of DICER, an endoribonuclease involved in miRNA
biogenesis, impairs pathological angiogenesis. Angiogenic miRNAs seem
to offer significant pro- or anti-angiogenic potential.


Junctional molecules 

Cell-cell communication is fundamental for vessels to act as a synchronized
unit along their longitudinal axis. Such coordination is
accomplished through gap junctions,
established by connexins, which inform upstream feeding vessels
about the perfusion status of downstream tissues to prevent shunting,
a well-known defect in tumour vessels. Apart from these long-range
communication junctions, endothelial cells and pericytes have junctions
for short-range communication.

Quiescent endothelial cells form a monolayer of interconnected
cells, whereas angiogenic endothelial cells dissociate their junctions to
migrate. The tight junctional molecules claudins, occludins and junctional
adhesion molecules maintain barriers, such as the BBB, whereas
adherens junctions establish cell-cell adhesion, cytoskeleton remodelling
and intracellular signalling. Loss of VE-cadherin does not prevent vessel
development, but induces defects in vascular remodelling and integrity.
VE-cadherin is also required for localizing CD34 and its sialomucin
receptor to cell-cell contacts for lumen formation. In quiescent phalanx
endothelial cells, VE-cadherin promotes vessel stabilization by inhibiting
VEGFR-2 signalling while activating TGF-β receptor pathways. Notably, oxygen
sensors control VE-cadherin expression in a feedback loop, so vessel
perfusion can be optimized when the oxygen supply is insufficient.
N-cadherin stabilizes contacts between endothelial cells and pericytes.
During sprouting, the adhesive function of VE-cadherin between adjacent
cells is reduced by endocytosis in response to VEGF and angiogenic
factors. At the same time, the localization of VE-cadherin at filopodia
allows tip cells to establish new contacts with cells on outreaching
sprouts. Antibodies recognizing neoepitopes of VE-cadherin that are
exposed after dissociation of adherens junctions during sprouting offer
opportunities for selective blockage of endothelial cell growth without
affecting endothelial cell maintenance.


Chemokines and G-protein-coupled receptors 
Chemokines regulate angiogenesis by recruiting pro-angiogenic
immune cells and endothelial progenitor cells, or through the direct
activation of endothelial G-protein-coupled chemokine receptors
(GPCRs). A well-known chemokine is SDF-1α, which binds to its receptor
CXCR4 on tip cells. SDF-1α is upregulated by HIF-1α in hypoxia,
and supports mobilization and retention of pro-angiogenic CXCR4+
BMDCs to promote revascularization of ischaemic organs. Cancer-associated
fibroblasts also release SDF-1α. Another GPCR ligand is the
biologically active lipid sphingosine-1-phosphate (S1P), which binds to
the S1P family of G-protein-coupled receptors (S1PRs) and regulates
endothelial cell barrier function, vessel stability and angiogenesis, in
part by crosstalking to PDGF and VEGF receptors, a process known
as GPCR-jacking. Inhibitors of SDF-1α, CXCR4 and S1P are being
developed for cancer treatment. Another recently identified GPCR is
GPR124, which regulates BBB differentiation.


Other pathways and challenges in translation 

Other pathways also regulate angiogenesis, some of which provide guid- 
ance signals to navigating tip cells (Box 2). Given the ancestral function of 
vessels to supply oxygen, vessel formation is under the control of oxygen 
sensors (Box 3). As already mentioned, various anti-angiogenic avenues 
in addition to VEGF blockade are under development. A challenge for
the future will be to identify optimal treatment regimens for these agents,
either as monotherapy or in combination with VEGF blockade. 

A major hurdle in translating the above-mentioned insights into 
clinically successful treatments stems from the fact that various anti- 
angiogenic approaches have different effects or are more effective 
in preclinical than clinical settings. This divergence may be due to 
several factors. First, most preclinical studies examine the effect of 
anti-angiogenic agents on transplantable, rapidly growing primary 
tumours, whereas most anti-angiogenic drugs have been approved for 
spontaneously arising, slowly evolving cancers in metastatic settings or 
for advanced disease in patients. The spontaneous tumour models in 
genetically engineered mice also do not recapitulate various aspects of 
the human disease. Differences in malignancy, vascularization and the 
stromal microenvironment between humans and mice lead to differ- 
ent responses. Second, only a few preclinical studies have analysed the 
effects of anti-angiogenic agents in residual disease after cytoablative 
therapy or in the adjuvant setting, and often without chemotherapy. 
Third, the doses used in preclinical mouse studies are often higher than 
those given to patients, resulting in more pronounced antivascular and 
antitumour effects in mice. Fourth, most genetic studies use mice in 
which the relevant angiogenic gene has been deleted before tumours 
become established, which is different from pharmacological interven- 
tion in patients after the cancer has become detectable. Finally, the dose 
and schedule of anti-angiogenic and chemotherapeutic drugs in the 
clinic have not been optimized, owing to cost and other considerations.


Clinical anti-angiogenesis with VEGF blockers 

Several VEGF blockers have been approved for clinical use in cancer
and eye diseases. So far, the US Food and Drug Administration has
approved the use of the VEGF-neutralizing antibody bevacizumab
(Avastin) for metastatic colorectal cancer, metastatic non-squamous
non-small-cell lung cancer, metastatic breast cancer, recurrent glioblastoma
multiforme (GBM) and metastatic renal cell carcinoma (RCC)
(Table 1). In addition, several multi-targeted TKIs, which block the
signalling of pathways such as VEGF, have been approved, including
sorafenib (Nexavar) for metastatic RCC and unresectable hepatocellular
carcinoma, and sunitinib (Sutent) and pazopanib (Votrient) for
metastatic RCC (Table 1). Recently, vandetanib (Zactima) has been
approved for unresectable or metastatic medullary thyroid cancer and
sunitinib has been recommended for approval for advanced pancreatic
neuroendocrine tumours, but the clinical data have not yet been published.
Treatment with VEGF inhibitors generally prolongs the survival
of responsive patients with cancer by a margin of the order of months
(Table 1). Two anti-VEGF compounds, the VEGF aptamer pegaptanib (Macugen)
and the anti-VEGF Fab antibody ranibizumab (Lucentis), both given by
intravitreous injection, have been approved for treatment of the wet
(neovascular) form of age-related macular degeneration, which causes blindness
owing to the formation of leaky neovessels. Bevacizumab is also used
off-label for this condition.

Notwithstanding these successes, the clinical use of VEGF blockers
in patients with cancer has shown that anti-angiogenic therapy is more
challenging than anticipated. For example, VEGF receptor TKIs are
effective as monotherapy in certain cancers, but fail in others or are
toxic when combined with chemotherapy. The use of bevacizumab
is approved only when combined with cytotoxic or cytokine therapy
(with the exception of patients with GBM). Many patients with metastatic
disease are refractory or acquire resistance to VEGF inhibitors,
and biomarkers to identify responders are missing. In a recent trial,
bevacizumab prolonged progression-free survival but not overall survival
in patients with metastatic RCC, and failed to show benefit in
the adjuvant setting. Moreover, questions have begun to arise about
whether anti-angiogenic therapy causes cancer cells to become more
malignant. What are the reasons for these problems, and what can
be done to move forward? The discussion in the next sections does
not offer an answer to the daily challenges in oncological practice, but
provides some avenues for developing future strategies.

Table 1 | Overview of anti-angiogenic drugs in cancer


Drug  Approved indication  Improvement in RR (%)  Improvement in PFS (months)  Improvement in OS (months)
Bevacizumab Metastatic 0 44 4.7* 
colorectal 0 14 14* 
cancer (with : , 
chemotherapy) 78 2.8 2.5* 
4.1 2.6 2.1¢ 
Metastatic 20 1.7 2.0*
non-squamous a % 
NSCLC (with 0.3-14.0 0.4-0.6 R 
chemotherapy) 
Metastatic breast 5.7 5.9 oe
cancer (with 9-18 08-19 Se 
chemotherapy)+ — 
1.8-13.4 1.2-29 S* 
9.9 2.1 St 
Recurrent GBM Currently only phase II data reported 
(monotherapy) 
Metastatic RCC 8 48 S* 
(with IFN-α) 24 33 se
Sunitinib Metastatic RCC 35 6.0 4.6*
Sorafenib Metastatic RCC 8 2.7 St
Unresectable HCC NS 2.8* 
2 14 23° 
Pazopanib Metastatic RCC 27 5.0 Rr
Anti-angiogenic therapies currently approved by the US Food and Drug Administration (FDA) for
treatment of malignancies. Per indication, the results of various trials are shown. The data show the
improvement observed after the addition of the anti-VEGF therapy. GBM, glioblastoma multiforme;
HCC, hepatocellular carcinoma; IFN, interferon; NR, not reported; NS, not significant; NSCLC,
non-small-cell lung carcinoma; OS, overall survival; PFS, progression-free survival; RCC, renal cell
carcinoma; RR, response rate. For reference, see http://clinicaltrials.gov.

*First-line therapy. 
Second-line therapy. 
The FDA recommended the withdrawal of bevacizumab for breast cancer in December 2010; 
this is under appeal, with a hearing expected in June 2011. However, bevacizumab is approved for 
metastatic breast cancer in Europe, except in the United Kingdom. 


Refractoriness to VEGF blockade in advanced cancer 

A fraction of patients with cancer are refractory to VEGF-inhibitor
treatment. The extent of refractoriness varies from one cancer to
another, differs between micro- and macrometastatic disease, and
differs for various types of VEGF blocker. Patients can be intrinsically
refractory and never show any response to treatment, or develop evasive
resistance during the course of treatment. Several mechanisms
have been proposed to explain these phenomena, which are related to
changes in the tumour cells, endothelial cells or other stromal cells
(Fig. 3). It is important to note that these mechanisms have been identified
for advanced, late-stage, macrometastatic disease only.

Tumour angiogenesis can become VEGF independent at a more
advanced stage because of the production of other pro-angiogenic molecules,
and thus respond poorly to VEGF blockade. Hypoxia induced by
vessel regression after VEGF blockade can also switch on a more invasive
and metastatic program, whereas in other cases, cancer (stem) cells can
become hypoxia-tolerant when acquiring extra mutations and survive
in poorly oxygenated niches. VEGF blockade inhibits sprouting angiogenesis,
but may not be as efficient in suppressing other modes of tumour
vascularization, relying on the recruitment of BMDCs, vessel co-option,
vasculogenic mimicry or vessel splitting. Certain tumours, such as pancreatic
carcinoma, contain a hypovascular stroma and are therefore less
sensitive to anti-angiogenic agents. Vessel pruning by VEGF blockade
can aggravate hypoxia, resulting in the upregulation of angiogenic factors
such as PlGF, FGFs, chemokines and ephrins, and this may rescue
tumour vascularization. Some tumour endothelial cells show signs of
cytogenetic abnormalities and transforming stem-cell potential, which
could alter sensitivity to VEGF inhibition. Furthermore, GBM-like stem
cells can differentiate into tumour endothelial cells, and VEGF blockers
can only partially inhibit this process.

Other stromal cells contribute to the resistance to VEGF blockade.

[Figure 3 panel labels: a, Production of angiogenic factors, hypoxia tolerance or increased invasiveness of more malignant and hypoxic tumour cells; b, Vessel lining by tumour cells or ECs derived from putative CSCs, vessel co-option or intussusception; c, Pericyte-covered tumour vessels; d, BMDC recruitment or CAF activation]


Figure 3 | Potential mechanisms of resistance to targeted VEGF therapy
in cancer. Different mechanisms underlie the resistance to VEGF blockade
seen in some patients with cancer. These mechanisms are not exclusive, and it
is likely that several occur simultaneously in a single tumour. a, In established
tumours, VEGF blockade aggravates hypoxia, which upregulates the production
of other angiogenic factors or increases tumour cell invasiveness. Tumour cells
that have acquired other mutations can also become hypoxia tolerant. The more
malignant tumour cells are shown as dark green, blue and purple cells. b, Other
modes of tumour vascularization, including intussusception, vasculogenic
mimicry, differentiation of putative cancer stem cells (CSCs) into endothelial
cells (ECs), vasculogenic vessel growth and vessel co-option (all denoted by the
mosaic red-purple vessels), may be less sensitive to VEGF blockade. c, Tumour
vessels covered by pericytes (green) are less sensitive to VEGF blockade. d,
Recruited pro-angiogenic BMDCs (yellow), macrophages (blue and purple)
or activated cancer-associated fibroblasts (CAFs; orange) can rescue tumour
vascularization by the production of pro-angiogenic factors.


Hypoxia promotes the recruitment of angiocompetent BMDCs, including
TEMs, TAMs, neutrophils, mast cells and CD11b+GR-1+ (also
known as ITGAM+Ly6G+) myeloid-derived suppressor cells, which
release angiogenic signals such as VEGF, BV8 (also known as PROK2)
and MMPs. VEGF blockade is often combined with chemotherapeutics:
by sensitizing endothelial cells to cytotoxic damage, VEGF inhibitors
impair endothelial cell survival and regrowth, but recruitment of
BMDCs after chemotherapy can revascularize tumours ('vasculogenic
rebound'). The release of angiogenic factors such as PDGF-CC by
cancer-associated fibroblasts also contributes to resistance. Furthermore,
vessels in most tumours are covered with few pericytes, but
microvessels in some cancers acquire a dense pericyte coat with a thick
basement membrane; such mature vessels are usually less sensitive to
VEGF blockers. Understanding the molecular basis of these
cancer-type-dependent resistance mechanisms against VEGF blockade offers
opportunities to improve anti-angiogenic treatment.


VEGF blockers in the adjuvant setting 

On the basis of the clinical experience with VEGF inhibitors in
macrometastatic cancer, VEGF blockers were anticipated to be beneficial
for micrometastatic disease in the adjuvant setting (that is, after
surgical resection of the primary tumour). However, compared with
chemotherapy alone, adjuvant treatment of patients with micrometastatic
disease with a combination of bevacizumab and chemotherapy
failed to prolong disease-free survival after three years. The administration
of an anti-VEGF antibody initially prolonged disease-free
survival in patients, but this benefit was lost after three years. The
precise reasons for this remain unclear. It is possible that the micrometastatic
tumour cells were less responsive to anti-VEGF treatment because
they were in a state of angiogenic dormancy. The recruitment of
pro-angiogenic BMDCs may convert micrometastasis to macrometastasis,
but it is unclear whether VEGF blockade eliminates this rescue pathway.
Another hypothesis is that an angiogenesis rebound occurs after the
arrest of anti-VEGF treatment, as documented in animal models.
However, such vascular rebound did not occur after long-term treatment
with a pan-VEGFR TKI in patients with GBM.

Another question is whether the transient disease-free survival benefit
of anti-VEGF treatment was attributable to a change in the nature of
the disease, and whether VEGF blockade caused the cancer to become
more malignant after an initial delay. Some preclinical models show
that VEGF blockade aggravates hypoxia and induces a pro-tumorigenic
inflammatory state, which promotes invasiveness and metastasis, despite
inhibition of primary tumour growth and prolongation of survival.
However, another preclinical study reported no effect of VEGF blockade
on metastasis in the adjuvant setting, and clinical trials have not shown
an increase in malignancy or tumour-growth rebound after VEGF blockade,
at least not in the metastatic setting. Moreover, a recent randomized
phase II trial showed that continuous dosing and discontinuous dosing
(four weeks on and two weeks off) of sunitinib have the same outcome
in RCC patients. Finally, a meta-analysis of advanced cancers shows that
VEGF blockade does not aggravate metastatic disease. Recurrent GBM
is an exception, in which VEGF blockade increased tumour invasion,
but even in these studies, tumours might have become more malignant
because the treatment prolonged survival and allowed the cancer to
progress further. Overall, there is an urgent need for an improved mechanistic
understanding of vessel growth and resistance to anti-angiogenic
therapy, particularly in micrometastatic lesions.


Tumour vessel abnormalities as a future target 
Another parameter that could determine the overall efficiency of
anti-VEGF therapy is the abnormal nature of tumour vessels. Tumour vessels
become abnormal in almost all aspects of their structure and function.
They are heterogeneous, tortuous, branch chaotically and have an 
uneven vessel lumen. In addition to abnormal endothelial cells, peri- 
cytes and the basement membrane are also abnormal. Owing to the 
leakiness of tumour vessels, escaping fluid raises the interstitial fluid 
pressure. As a result, blood flow is heterogeneous, and oxygen, nutrients, 
immune cells and drugs are distributed unevenly. Because radiation 
therapy and many chemotherapeutics rely on the formation of oxygen 
radicals to kill cancer cells, tumour hypoxia reduces their efficacy. These 
vessel abnormalities create a hostile milieu, characterized by hypoxia, 
low pH and high fluid pressure, which can select for more malignant 
cancer cells and lower barriers to their escape through leaky vessels. 
These findings raise questions for the future. Excessive vessel pruning
and growth arrest by anti-angiogenic agents could aggravate tumour
invasiveness and metastasis by increasing hypoxia and creating a
pro-tumorigenic inflammatory state. Vessel normalization could provide
new therapeutic opportunities to slow down tumour invasiveness and
dissemination, and increase tumour responses to chemotherapeutics
and radiotherapy. Another consideration is how vessel normalization
should be combined with an anti-angiogenic treatment. Vessel normalization
was first recognized in mice xenografted with colon cancer
and treated with an anti-VEGF antibody, but it is transient in mice and
patients. Recent genetic studies in mice have shown that sustained
vessel normalization can provide benefits. Indeed, haplodeficiency of
the oxygen-sensor PHD2 in endothelial cells induces sustained normalization
of tumour vessels, without altering vessel density or size. In
these vessels, leakage, tortuosity and remodelling are reduced, whereas
endothelial cell quiescence, barrier tightening and vessel maturation
are increased, changes that boost perfusion and decrease hypoxia.
A streamlined monolayer of phalanx endothelial cells is also formed,
providing a more impenetrable barrier for intravasating tumour cells.
These changes do not affect tumour growth, but reduce tumour cell invasiveness,
intravasation and metastasis. Although these genetic studies
offer an elegant example, the challenge will be to develop therapeutic
strategies that translate these insights into daily practice in the clinic.


Future directions 

An important question is how anti-angiogenic medicine can be
improved. In the short term, the use of current anti-VEGF agents should
be optimized. Given the low response rates, a step forward would be
the discovery of predictive biomarkers to identify responders among
the large patient group of non-responders. So far, only a few candidates
for predictive biomarkers have been identified, but they emerged from
small studies and require prospective validation in independent randomized
trials. Another consideration is the optimization of the dose
and duration of anti-angiogenic drug delivery. Little is understood about
the mechanisms of vascularization of micrometastatic lesions, and
agents that can block other modes of tumour vascularization (such as
co-option, intussusception, vasculogenesis and vasculogenic mimicry)
are needed. Furthermore, understanding the mechanistic differences
between VEGFR TKIs and anti-VEGF antibodies (for instance, whether
the former are effective as monotherapy because they inhibit several
targets, whereas the latter require combination chemotherapy in most
instances) will help to optimize the design of anticancer treatments.

In the intermediate term, anti-VEGF agents could be combined with
agents that target the escape pathways detected in clinical studies (not
in mice). Examples are ANG-2, PlGF, SDF-1α and CXCR4 (ref. 14). The
challenge will be when to add these second agents: before, during or
after anti-VEGF therapy. In the long term, the therapeutic potential
of vascular normalization agents based on recently identified targets
should be evaluated in preclinical models, but their clinical development
will require years. By using combinatorial therapeutic approaches,
it will be important to explore the eradication of most tumour vessels
and normalization of residual vessels for longer durations than are now
achievable with VEGF blockers alone. The potential of tumour vessel
normalization to improve anticancer immune therapy should be
explored further. Finally, it is important to test whether the approved
anti-VEGF agents and those under development could be used to treat
various non-malignant diseases characterized by abnormal vasculature,
which afflict millions of people worldwide and in many cases
have no effective treatment, such as age-related macular degeneration
causing blindness, schwannomas causing loss of hearing, and
atherosclerotic plaques causing stroke and myocardial infarction after
rupture. A tight integration between preclinical and clinical research
is crucial to achieve these goals.


Lessons on the pathogenesis of aneurysm from heritable conditions


Mark E. Lindsay1,2 & Harry C. Dietz2,3


Aortic aneurysm is common, accounting for 1-2% of all deaths in industrialized countries. Early theories of the causes
of human aneurysm mostly focused on inherited or acquired defects in components of the extracellular matrix in the
aorta. Although several mutations in the genes encoding extracellular matrix proteins have been recognized, more recent
discoveries have shown important perturbations in cytokine signalling cascades and intracellular components of the
smooth muscle contractile apparatus. The modelling of single-gene heritable aneurysm disorders in mice has shown
unexpected involvement of the transforming growth factor-β cytokine pathway in aortic aneurysm, highlighting the
potential for new therapeutic strategies.


Aneurysm or arterial enlargement is the gross phenotype that
manifests progressive organ failure of large arteries, including
the aorta. Aortic aneurysm is usually not inherently dangerous,
but enlarged arteries show a predisposition for tear (known as dissection),
with high mortality rates. Clinicians recognize two predominant
spatial distributions of aortic aneurysms in patient groups. The most
common form, abdominal aortic aneurysm (AAA), is typically associated
with advanced age and atherosclerosis, with attendant risk factors
such as hypercholesterolaemia, hypertension and/or diabetes. These 
lesions are pathologically characterized by atheromata, invasion of 
inflammatory cells, destructive extracellular matrix remodelling, and 
depletion and dysfunction of vascular smooth muscle cells (VSMCs). 
Although it is clear that genetic determinants influence the development 
of AAA, there has been no description of a single major gene or locus 
effect that is sufficient to cause isolated abdominal aneurysm (that is, in 
the absence of evidence for a more systemic vasculopathy). The weight 
of evidence therefore indicates that AAA is a complex disorder that 
integrates the influence of predisposing genes with lifestyle-associated 
risk factors — analogous to coronary artery disease. 

The second common site for aneurysm is the thoracic ascending
aorta. Unlike AAA, thoracic aortic aneurysm (TAA) occurs in all
age groups, is more highly associated with hereditary influences and
does not show obligate association with cardiovascular risk factors.
Pathologically, inherited forms of TAA typically show destructive matrix
remodelling with elastin fragmentation, proliferation of VSMCs and a
less prominent inflammatory component without atheromata. Many
presentations of TAA show classic Mendelian inheritance with high
or complete penetrance, suggesting the major contribution of a single
gene. Familial TAA can be subdivided into syndromic presentations that
show prominent features of a systemic connective tissue disorder (such
as Marfan syndrome (MFS) and Loeys-Dietz syndrome (LDS)) and
non-syndromic presentations (such as bicommissural aortic valve with TAA,
and isolated familial TAA). Both the syndromic and non-syndromic
groups include many disorders in which disease predominates in the very
proximal ascending thoracic aorta (Fig. 1). In fact, there are only rare
exceptions in which the ascending aorta is infrequently involved, such
as vascular Ehlers-Danlos syndrome. Table 1 includes a list of human and
mouse single-gene disorders associated with aneurysm predisposition.

Over the past couple of decades, the genes responsible for heritable
aortic aneurysm have been identified at an accelerating pace. Gene
identification has allowed the creation of mouse models of inherited
aortic aneurysm, providing the first opportunity to temporally and
comprehensively interrogate the pathogenic sequence of aneurysm,
extending from predisposition to clinical consequence, in an experimental
context that mimics the physiological complexity of the human system.
This combination of human molecular genetics and animal modelling
has shown the involvement of diverse cytokine pathways, prominently
the role of the transforming growth factor-β (TGF-β) pathway in aortic
aneurysm. This Review focuses on the mechanisms of failure of large
vessel wall homeostasis that challenge or inform historical perspectives,
attempts to integrate emerging models of disease, and discusses the
remaining challenges and opportunities in aneurysm research.


Excessive focus on elastin in TAA pathogenesis 

There has been disproportionate historical focus on elastic fibres in
pathogenetic models of inherited TAA. This derives from the near-uniform
histological observation of reduced elastin content and elastic
fibre fragmentation in the aortic media (the middle aortic layer), known
as cystic medial necrosis. However, many elastin-deficiency states do not
associate with aneurysm as a prominent phenotype. Although aortic
aneurysm is an extremely rare manifestation of cutis laxa syndromes
caused by mutations in the elastin gene, it is not observed in mice or
humans with dominant and recessive forms of cutis laxa caused by deficiency
of fibulin-5 (ref. 2), a crucial mediator of elastogenesis. By contrast,
aneurysmal disease, including prominent involvement of the ascending
aorta, is highly penetrant in inherited cutis laxa caused by fibulin-4
deficiency. Humans and mice with a fibrillin-1 deficiency show failed
elastic fibre homeostasis and highly penetrant aortic root aneurysms in
the context of MFS (see ref. 4 and references therein). Pseudoxanthoma
elasticum caused by ABCC6 deficiency in humans also shows postnatally
acquired elastic fibre fragmentation in the aorta, but does not typically
show aneurysm. These observations raise the important question of
which molecular events, besides elastin-related issues, are common to
the fibrillin-1- and fibulin-4-deficiency states but not recapitulated by
fibulin-5 deficiency. The focus on the aortic media and elastic fibres may
have been a distraction, and further insight could come from considering
other distinguishing features of the proximal ascending aorta, including
developmental cellular ontology. Genes mutated in MFS and vascular
Ehlers-Danlos syndrome (FBN1 and COL3A1, respectively) encode
extracellular matrix elements (fibrillin-1 and collagen α-1(III), respectively).
This discovery led to the generation of pathogenetic models that
singularly invoke inherent structural weakness of the tissues. Such a view
boded poorly for the development of productive medical treatment strategies,
requiring a means to alter the structural composition of inherently
weak tissues. Fortunately, for both patients and researchers, the story
of hereditary aneurysm has turned out to be much more complex and
potentially permissive for therapeutic intervention.

1Division of Pediatric Cardiology, Department of Pediatrics, Johns Hopkins Medical Institutions, Baltimore, Maryland 21205-1832, USA. 2McKusick-Nathans Institute of Genetic Medicine, Johns
Hopkins Medical Institutions, Baltimore, Maryland 21205-1832, USA. 3Howard Hughes Medical Institute, Baltimore, Maryland, 21205-1832, USA.

308 | NATURE | VOL 473 | 19 MAY 2011

© 2011 Macmillan Publishers Limited. All rights reserved


MFS provides a link between aneurysm and TGF-β
A shift in thinking about the pathogenesis of aortic aneurysm occurred
during the study of MFS. Although many of the clinical manifestations
of MFS could be caused by simple tissue weakness imposed by
fibrillin-1 deficiency (such as aortic aneurysm, eye lens dislocation and
emphysema), others were not so easily reconciled (bone overgrowth,
craniofacial alterations and myxomatous valve disease). Insight came
from studying lung disease in fibrillin-1-deficient mice. Contrary
to expectation, mouse models of MFS did not show destructive and
inflammatory emphysema. Instead, they showed primary failure of distal
alveolar septation during late embryogenesis and the perinatal period.
Mechanistic hypotheses built on the observation that fibrillin
proteins show marked homology to latent TGF-β-binding proteins
(LTBPs). Most TGF-β is secreted from cells in the context of a large
latent complex (LLC) that includes the mature cytokine, a dimer of its
processed amino-terminal propeptide (latency-associated peptide) and
one of three LTBP isoforms (LTBP-1, LTBP-3 or LTBP-4). LTBPs target
the TGF-β LLC to binding partners such as fibronectin and microfibrils
composed of fibrillin-1, an event that is thought to regulate TGF-β
bioavailability and activity by controlling access to, or the efficiency of,
TGF-β activators. One hypothesis was that failed or improper matrix
sequestration of the LLC owing to fibrillin-1 deficiency could lead
to promiscuous TGF-β activation. In keeping with this concept, the
developing lungs of fibrillin-1-deficient mice showed decreased LLC
levels but raised levels of free TGF-β and increased TGF-β signalling,
as demonstrated by the nuclear translocation and phosphorylation of
receptor-activated SMAD proteins 2 and 3 (pSMAD2/3). Notably, distal
alveolar septation could be restored in fibrillin-1-deficient mice by the
administration of a pan-specific anti-TGF-β neutralizing antibody.
Other manifestations of MFS, including myxomatous mitral valve
disease, skeletal myopathy and, most importantly, aortic root aneurysm,
were associated with increased TGF-β signalling in mouse models of
MFS, and were attenuated or prevented by TGF-β antagonism with a
neutralizing anti-TGF-β antibody in these mice in vivo.


TGF-β receptor mutations 

Perhaps the most direct evidence of a major role for TGF-β in aneurysm 
pathogenesis came from the finding that mutations in the TGFBR1 
and TGFBR2 genes, which encode the TGF-β receptor subunits 
TGFR-1 (also known as ALK-5) and TGFR-2, respectively, result 
in aneurysm conditions that have undeniable phenotypic overlap 
with MFS, a notable example of which is LDS*” (Table 1). Similar to 
MFS, patients with LDS typically show skeletal involvement, including 
long fingers, chest wall deformity and scoliosis. Other shared features 
include widening of the dural sac, skin stretch marks and mitral valve 
prolapse. Patients with LDS typically do not show lens dislocation, 
and can show many discriminating systemic features such as widely 
spaced eyes (hypertelorism), cleft palate or bifid uvula, cervical spine 
malformation or instability, osteoporosis and club foot deformity*”. 
Most importantly, patients with LDS show highly penetrant arterial 
tortuosity (an elongation of an artery resulting in a twisted course) and 
a strong predisposition for aneurysm and dissection throughout the 
arterial tree. Vascular disease in patients with LDS is more aggressive 
than in those with MFS, with rupture at a younger age (as young as 
6 months) and at smaller aortic dimensions"”. 

Nearly all patients with LDS are heterozygous for missense 
substitutions in the kinase domain of TGFR-1 or TGFR-2. Recombinant 
expression of mutant receptors in cells naive for the corresponding 
receptor subunit failed to support TGF-β signalling, leading to the 
hypothesis that haploinsufficiency was the relevant mechanism’. 
Figure 1 | Sites of TAA in transforming growth factor-β vasculopathy 
syndromes. Multidetector computer tomographic reconstruction images 
are shown. a, The thoracic aorta of a 9-year-old patient with LDS after 
emergent repair of a proximal descending (Stanford type B) aortic dissection, 
with a large unrepaired aortic root aneurysm (yellow arrowhead). The red 
arrowhead denotes the dacron tube graft. b, The thoracic aorta of a 16-year- 
old patient with LDS after surgical repair of an aortic root aneurysm (yellow 
arrowhead), with a discrete fusiform aneurysm of the proximal descending 
aorta (red arrowhead). Scale bars, 3 cm. 


If this were the case, however, the identification of early nonsense 
mutations that elicit messenger RNA clearance through nonsense- 
mediated mRNA decay or whole allele deletions might be expected. 
Instead, the skewed mutational repertoire seems to manifest selection 
for receptor variants that traffic to the cell surface and bind ligand but 
lack the ability to propagate signal. Co-transfection studies in human 
cells using equimolar concentrations of normal and mutant receptors 
and experiments with heterozygous patient cells have shown apparent 
preservation of the signalling potential of wild-type receptor subunits, 
excluding a conventional dominant-negative mechanism*”. Notably, 
patient vascular tissue obtained at surgery or autopsy has consistently 
shown paradoxically enhanced TGF-β signalling, demonstrated by 
nuclear accumulation of pSMAD2 in VSMCs and increased output of 
TGF-β-driven gene products such as collagens and connective tissue 
growth factor (CTGF)*"”. The architectural changes seen in the aortic 
wall of patients with LDS are highly reminiscent of those seen in MFS 
and other inherited forms of aortic root aneurysm. The mechanisms 
underlying this paradoxical effect remain unknown, but potentially 
include altered receptor trafficking, impaired autoregulation of TGF-β 
signalling, alternative signalling cascades or non-autonomous cellular 
events (see ‘Cancer biology as a guide to aneurysm’). Heterozygous 
loss-of-function SMAD3 mutations recapitulate both the LDS phenotype 
and paradoxical enhancement of TGF-β signalling in the aortic wall". 


Other TGF-β links to aneurysm 

Several other aneurysmal disorders have been linked to TGF-β 
signalling’. High levels of TGF-β signalling, as assessed by nuclear pSMAD2 
accumulation, have been observed in surgical samples from patients 
with diverse aneurysm conditions such as isolated familial TAA and 
bicommissural aortic valve with TAA’. Patients with autosomal recessive 
arterial tortuosity syndrome show diffuse and severe arterial tortuosity 
that is often associated with vascular stenoses, segmental vascular hypo- 
plasia and arterial aneurysms, caused by loss-of-function mutations in 
the SLC2A10 (also known as GLUT10) gene, which encodes the integral 
membrane protein glucose transporter type 10 (ref. 12). Vascular tissue 
from patients with arterial tortuosity syndrome shows the same signature 
of high TGF-β signalling seen in tissue from patients with MFS or LDS. 
Although the mechanism is poorly understood, cultured fibroblasts from 
patients with arterial tortuosity syndrome show impaired expression of 
the proteoglycan decorin, an antagonist of TGF-β superfamily 
signalling’. Deficiency of the extracellular protein fibulin-4 causes autosomal 
recessive cutis laxa in association with arterial tortuosity and aortic aneu- 
rysm in both humans and mice*”’. Curiously, fibulin-5 deficiency causes 
cutis laxa with arterial tortuosity but not aneurysm. Both fibulin-4 and 
fibulin-5 bind to fibrillin-1 and elastin, and deficiency in fibulin-4 or 
Table 1 | Human hereditary aneurysm conditions and mouse models of aneurysm 

Gene (protein) | Human aneurysmal syndrome | Animal-model phenotype | Pathway 

Extracellular matrix protein 
FBN1 (fibrillin-1) | MFS; highly penetrant ascending aortic aneurysm | KO: perinatal lethal, pulmonary hypoplasia, arteriopathy; hypomorphic: arteriopathy, aneurysm, dissection and systemic manifestations of MFS | TGF-β 
EFEMP2 (fibulin-4) | Cutis laxa with aneurysm; ascending aortic aneurysm and tortuosity | KO: ascending aortic aneurysm, defective elastogenesis, perinatal lethality | TGF-β 
ELN (elastin) | Cutis laxa with aneurysm; low penetrance ascending aortic aneurysm and dissection | Haploinsufficient: obstructive arterial disease with increased VSMC proliferation, increased lamellae number; KO: accentuated phenotype | Unknown 
COL1A1 (collagen α-1(I)) | Osteogenesis imperfecta; extremely rare aortic aneurysm; EDS, type 7A; dissection of medium-sized arteries | KO: adult-onset aortic aneurysm and dissection | Collagen metabolism 
COL1A2 (collagen α-2(I)) | Osteogenesis imperfecta; extremely rare aortic aneurysm; EDS, cardiac valvular dystrophy type 7B; borderline aortic root enlargement with aortic regurgitation | Homozygous LOF: decreased body weight, bony abnormalities, no arterial phenotype reported | Collagen metabolism 
COL3A1 (collagen α-1(III)) | EDS, type 4; frequent arterial dissection with infrequent aneurysm | KO: frequent neonatal mortality, aortic rupture, intestinal rupture | Collagen metabolism 
COL4A1 (collagen α-1(IV)) | Hereditary angiopathy, nephropathy, aneurysms and muscle cramps; infrequent aneurysms | KO: embryonic lethal (E10.5-11.5), basement membrane failure | Collagen metabolism 
COL4A5 (collagen α-5(IV)) | X-linked Alport syndrome; ascending aortic and abdominal aneurysms and dissections | Nonsense mutation: no overt aortic disease noted | Collagen metabolism 
LOX (lysyl oxidase) | No human phenotype described | KO: low penetrance aortic aneurysm, perinatal lethality | Collagen metabolism; TGF-β 
PLOD1 (lysyl hydroxylase 1) | EDS, type 6; rare aneurysm | KO: spontaneous aneurysm and dissection, gait abnormalities | Collagen metabolism 
PLOD3 (lysyl hydroxylase 3) | Bone fragility with contractures, arterial rupture and deafness; frequent medium-sized arterial aneurysms | KO: embryonic lethal (E9.5) and basement membrane fracture | Collagen metabolism 


Transmembrane protein 
TGFBR1 (TGF-β receptor type 1) | LDS; highly penetrant root and diffuse large and medium arterial aneurysms | KO: midgestational death with yolk sac defects; M318R heterozygous knock-in: aortic root and diffuse aneurysm (D. Loch, unpublished observations) | TGF-β 
TGFBR2 (TGF-β receptor type 2) | LDS; highly penetrant root and diffuse large and medium arterial aneurysms; familial thoracic aortic aneurysms and dissections; highly penetrant root and medium arterial aneurysms | KO: defects in haematopoiesis and vasculogenesis, embryonic lethal (E10.5); Tgfbr2"™: impaired elastogenesis, decreased lysyl oxidase in aorta; G357W heterozygous knock-in: aortic root and diffuse aneurysm (D. Loch, unpublished observations) | TGF-β 
ENG (endoglin) | Hereditary haemorrhagic telangiectasia; incompletely penetrant aortic and medium-sized arterial aneurysms | KO: defective vasculogenesis, embryonic lethal (E10); haploinsufficient: haemorrhagic telangiectasia causing strokes, fatal haemorrhage and heart failure | TGF-β superfamily 
ACVRL1 (activin receptor-like kinase 1) | Hereditary haemorrhagic telangiectasia; incompletely penetrant aortic and medium-sized arterial aneurysms | KO: defective vasculogenesis, embryonic lethal, excessive fusion of capillary plexuses; haploinsufficient: haemorrhagic telangiectasia | TGF-β superfamily 
SLC2A10 (glucose transporter type 10) | Arterial tortuosity syndrome; diffuse arterial tortuosity, stenoses, aneurysms | Homozygote missense: arterial thickening with increased elastin deposition, elastin fractures at advanced age | TGF-β 
NOTCH1 (NOTCH1) | Bicuspid valve with ascending aortic aneurysm | KO: embryonic lethal (E9.5), required for somite segmentation, defects in angiogenesis | NOTCH1-JAGGED1 
JAG1 (JAGGED 1) | Alagille syndrome; intracranial aneurysms, coarctation of the aorta, aortic aneurysm | KO: embryonic lethal (E9.5) with diffuse haemorrhages | NOTCH1-JAGGED1 
GJA1 (connexin-43) | Hypoplastic left heart syndrome (HLHS) | Nonsense mutation (W45X): coronary artery aneurysms | Unknown 


fibulin-5 is associated with profound failure of elastogenesis — the prob- 
able cause of arterial tortuosity. Fibulin-5 mainly promotes elastin fibre 
assembly by the recruitment of tropoelastin to microfibrils”. By contrast, 
fibulin-4 is needed for the recruitment of lysyl oxidase (LOX), a copper- 
dependent enzyme that catalyses crosslinking of elastin molecules”. In 
keeping with these findings, LOX-deficient mice show severely disrupted 
aortic laminae, arterial tortuosity and low penetrance aneurysm””. Mice 
and humans deficient in fibulin-4 show increased TGF-β signalling in 
the vessel wall, which may be directly related to the ability of LOX to 
inhibit TGF-β enzymatically’®. In this light, it seems that the aneurysm 
phenotype specific to fibulin-4 deficiency manifests a loss of a function 
other than elastin crosslinking, plausibly including TGF-β repression. 
Despite several lines of evidence invoking high TGF-β signalling in 
aneurysm, conflicting observations exist. For example, high vascular 
TGF-β signalling was shown in Emilin1-deficient mice in association 
with a diffusely small vascular system and hypertension in juvenile 
animals’”. The developmental timing and vascular distribution of high 
TGF-β signalling were not assessed, and there was no comprehensive 
analysis for aneurysms””. Another study has shown that lineage-specific 
ablation of TGF-β signalling in VSMCs in the ascending aorta results 
in perturbations of vascular morphogenesis in fetal mice, including 
persistent truncus arteriosus, impaired elastogenesis and apparent vessel 
widening that was equated with aneurysm”; similar impairment of 
elastogenesis was seen only in the descending aorta of mice with global 
Tgfbr2 deletion in VSMCs”. There is further indirect evidence that low 
TGF-β signalling may also be involved in developmental presentations 
of aneurysm in MFS. An emerging view is that the fibrillin proteins 
have a dichotomous role in TGF-β regulation. Fibrillin-2 was shown to 
concentrate TGF-β ligands (prominently BMP7), and this is required 
to support morphogenetic events at sites of intended function. In the 
Gene (protein) | Human aneurysmal syndrome | Animal-model phenotype | Pathway 

Transmembrane protein cont. 
PKD1 (polycystin-1) | Polycystic kidney disease with intracranial aneurysms | KO: embryonic lethal (E14.5) with polycystic kidneys; hypomorphic expression: adult-onset aortic aneurysm and dissection | mTOR 
PKD2 (polycystin-2) | Polycystic kidney disease with intracranial aneurysms | KO: defects in cardiac septation and left-right axis determination, kidney and pancreatic cysts | Unknown 

Cytoplasmic protein 
SMAD3 (SMAD family member 3) | LDS; aortic aneurysm with osteoarthritis | KO: metastatic colorectal cancer | TGF-β 
ACTA2 (α-smooth muscle actin) | Familial aortic aneurysm with livedo reticularis and iris flocculi | KO: viable offspring with normal lifespan and impaired vascular contractility | IGF-1, Ang II 
MYH11 (smooth muscle myosin) | Familial aortic aneurysm with patent ductus arteriosus | KO: neonatal lethality, urinary retention, dilated cardiomyopathy | IGF-1, Ang II 
FLNA (filamin-A) | Periventricular nodular heterotopia with EDS features; ascending aortic aneurysm and valvular dystrophy | KO: neonatal lethality, persistent truncus arteriosus, endothelial cell-cell contact defects | Unknown 
NF1 (neurofibromin-1) | Neurofibromatosis; medium-sized arterial aneurysm and stenosis | KO: enlarged head, pale liver, cardiac malformations | Ras-MEK-ERK 
PTPN11 (protein-tyrosine phosphatase 2C) | Noonan and LEOPARD syndromes; coronary artery aneurysms and rare ascending aortic aneurysm | KO: embryos die at preimplantation; missense mutation (D61G): cardiac defects, defective valvulogenesis, skeletal anomalies, myeloproliferative disorder | Ras-MEK-ERK 
NPHP3 (nephrocystin-3) | Nephronophthisis | KO: low penetrance intracranial aneurysms | Unknown 
NOS3 (nitric oxide synthase 3) | Refractory hypertension | KO: abnormal aortic development with bicuspid aortic valve; in combination with Apoe−/−, mice show abdominal arterial aneurysm and dissections | Nitric oxide 
TSC2 (tuberin) | Tuberous sclerosis; diffuse thoracoabdominal aneurysms | Heterozygous KO: increased proliferation of VSMCs after injury | mTOR 
GAA (lysosomal α-glucosidase) | Acid maltase deficiency, adult onset; intracranial aneurysms | KO: lysosomal accumulation in heart, aorta, skeletal muscle | Unknown 
S100A12 (S100A12) | No human phenotype; increased S100A12 protein expression in human MYH11-mutation aneurysmal tissues | Sm22α promoter-S100A12 transgenic mouse: vascular smooth muscle disarray, elastin fragmentation, thoracic aneurysm | IL-6, TGF-β 

Nuclear protein 
MED12 (mediator complex subunit 12) | Lujan-Fryns syndrome; extremely rare aneurysm | Hypomorphic mutants: embryonic lethal (E10), defects in neural tube closure, somatogenesis, heart formation | WNT-β-catenin, WNT-PCP 
KLF15 (Krüppel-like factor 15) | No human phenotype; Krüppel-like factor 15 downregulated in human abdominal aortic aneurysm | KO: aortic aneurysm and cardiomyopathy | TSP-1, p53, TGF-β 
KLF2 (Krüppel-like factor 2) | No human phenotype | KO: embryonic aortic aneurysm and dissection | Unknown 

Chromosomal anomaly 
45,X | Turner syndrome; bicuspid aortic valve, coarctation of the aorta, ascending aneurysm | XO mice (with a single X chromosome): no phenotypic heart disease | Unknown 

Chemical model 
– | No human phenotype | Ang-II-infusion model | Ang II, MCP-1, IL-6, TGF-β 
– | No human phenotype | Elastase-infusion model | Unknown 
– | No human phenotype | Periarterial calcium application | JNK1 (ref. 100) 


E10.5, embryonic day 10.5; EDS, Ehlers-Danlos syndrome; KO, knockout; LOF, loss of function; mTOR, mammalian target of rapamycin; PCP, planar cell polarity. PTPN11 is also known as SHP2. 


developing autopod, fibrillin-2-deficient mice show BMP7 deficiency 
and recapitulate the syndactyly phenotype observed in BMP7-targeted 
mice”. Mice deficient in both fibrillin-1 and fibrillin-2 show persistent 
truncus arteriosus, which historically has been associated with loss of 
TGF-β signalling (F. Ramirez, personal communication). 


Downstream of TGF-β 

Little is known about the precise pathogenetic sequence downstream 
of TGF-β that is involved in aneurysm progression. Enhancement of 
matrix metalloproteinase (MMP) activity is frequently invoked. Such 
a model is both theoretically appealing and experimentally validated. 
Evidence includes high levels of MMP expression and activity in many 
natural and experimentally induced presentations of aneurysm, and the 
ability of MMP inhibitors (such as doxycycline) to attenuate aneurysm 
progression, including in MFS mouse models”. Although TGF-β has 
been associated with reduced expression and activity of several MMPs 
in many tissues and contexts, it has been shown to specifically induce 
MMP2 and MMP9 expression, the MMPs that are most closely 
associated with aneurysm conditions such as MFS””’. 

Most studies of TGF-β-related disease states have focused on 
‘canonical’ (SMAD-dependent) signalling cascades, with a more historic 
than empirical basis for such an emphasis. More recently, it has been 
shown that ligand-activated TGF-β receptors can stimulate signalling 
through non-canonical pathways, including the phosphatidylinositol- 
3-OH kinase (PI(3)K)/AKT cascade, the Rho-associated protein kinase 
(ROCK) cascade and the mitogen-activated protein kinase (MAPK) 
cascade™ (Fig. 2). Although mouse and human models of MFS 
demonstrate upregulation of canonical signalling’, its importance is 
not clear. There is emerging evidence that MAPKs may have a role in 
aneurysm. For example, activation of p38 has been observed in the aorta 
of young mice homozygous for a hypomorphic fibrillin-1 (Fbn1) allele”. 
We have also observed TGF-β- and angiotensin II type 1 receptor 
(AT1R)-dependent activation of the extracellular-signal regulated 
kinases (ERK1 and ERK2) in the aorta of fibrillin-1-deficient mice, 
and abrogation of pathological aortic root growth after treatment with 
a specific ERK inhibitor’. ERK1 upregulation has also been observed 
in fibulin-4 deficiency in mice and humans*”, perhaps providing a link 
between the loss of function of fibulin-4 and fibrillin-1. 

Infusion of angiotensin II (Ang II) or the application of CaCl₂ 
promotes AAA in mice. Antagonism of c-Jun N-terminal protein 
kinase 1 (JNK1) signalling has been shown to attenuate disease 
progression in an AAA mouse model”. This occurred in association 
with reduced MMP2 and MMP9 activity and enhanced LOX expression. 
ERK activity is known to be instrumental in diverse aspects of vascular 
pathology, including VSMC proliferation and migration, and has 
been linked to TGF-β-mediated MMP upregulation and epithelial-to- 
mesenchymal transition. Finally, low penetrance aneurysm has been 
observed in human conditions known to modify Ras signalling, a major 
upstream activator of ERK. These include gain-of-function mutations 
in the PTPN11 gene, which encodes a protein tyrosine phosphatase, 
leading to Noonan syndrome, and loss-of-function mutations in the 
NF1 gene, which encodes a Ras GTPase-activating protein, leading to 
neurofibromatosis type 1 (refs 29, 30) (Table 1 and Fig. 2). 


Angiotensin II and aneurysm 
Aortic aneurysm and dissection can be modelled through the infusion 
of Ang II in mice deficient for apolipoprotein E (Apoe−/−) (ref. 4 and 
references therein), or with higher doses in aged wild-type mice”. 
Aneurysm formation occurs with high penetrance in the suprarenal 
abdominal aorta; the ascending aorta can also be involved with lower 
frequency and severity. Aneurysm has been shown to be independent 
of hypercholesterolaemia and hypertension in this model, but requires 
intact AT1R signalling, innate immunity and MMP activity’. 

The increased expression of monocyte chemoattractant protein-1 
(MCP-1), its receptor (CCR2) and interleukin-6 (IL-6) have been 
demonstrated in models of Ang-II-induced aneurysm, and CCR2 
Figure 2 | The TGF-β and Ang II signalling pathways. Signalling 
cascades of TGF-β and Ang II have been implicated in the pathogenesis 
of aneurysm. Fibrillin-1, the major component of extracellular microfibrils, 
binds and sequesters the large latent complex (LLC) of TGF-β. After TGF-β 
activation (release), ligand binds to the TGF-β receptor (TGFR) and activates 
both canonical (grey) and non-canonical (blue) signalling cascades. The 
extensive crosstalk between the TGF-β and Ang II type 1 receptor (AT1R) 
signalling pathways is indicated. Key terminal events in the pathogenesis 
of aneurysm may include MMP-mediated proteolysis, CTGF-mediated 
epithelial-to-mesenchymal transition and tissue remodelling, or IL-6- and 
MCP-1-mediated inflammation. Proteins indicated in purple have been 
directly implicated in human hereditary aneurysmal disease (see Table 1). 
MAP3K7, mitogen-activated protein kinase kinase kinase 7 (also known as 
TAK1); MEK1, MAP kinase kinase 1; MLCK, myosin light chain kinase; MLCP, 
myosin light chain phosphatase; p190 RhoGAP, Rho GTPase-activating 
protein 5; SHP2, protein tyrosine phosphatase 2C; α-SMA, α-smooth muscle actin. 
signalling has been shown to contribute to IL-6 expression. Ang II 
infusion in mice was shown to associate with accumulation of CCR2+ 
macrophages in the vascular adventitia (the outermost vessel tissue 
layer), specifically at the sites of aneurysm formation and most 
prominently at the sites of dissection*'. Mice lacking CCR2 protein 
showed reduced macrophage accumulation, decreased IL-6 and MCP-1 
expression, and protection from dissection in response to Ang II infusion. 
In vitro modelling has demonstrated that monocytes co-cultured with 
adventitial fibroblasts upregulate IL-6 and MCP-1, and show enhanced 
differentiation into macrophages. Although the activity of a fibroblast- 
derived paracrine factor was suggested, there has been no speculation 
about its identity. The potent anti-inflammatory cytokine TGF-β is a 
promising candidate. TGF-β is known to induce the expression of both 
IL-6 and MCP-1 in many cell types, including fibroblasts and VSMCs, 
and can positively regulate monocyte recruitment and macrophage 
differentiation*. Increased MCP-1 expression was also proposed as 
a determinant of disease in response to JNK1 signalling, with JNK1 
suppressing the expression of LOX that normally negatively regulates 
MCP-1 (ref. 32). It is notable that although the loss of CCR2 or IL-6 
expression prevented early dissection in response to acute Ang II infusion 
in mice, it did not preclude dissection after chronic infusion. Increased 
adventitial IL-6 expression was observed in human aneurysms, but only 
at sites of dissection. In this light, it seems that the described IL-6 and 
MCP-1 amplification loop contributes to, but is not required for, Ang-II- 
induced aneurysm and dissection in mice, and more work needs to be 
done to determine its contribution to disease pathogenesis in humans. 
Attempts to integrate TGF-β signalling into the pathogenesis of 
Ang-II-induced aneurysm models are frustrated by limited and 
contradictory empirical knowledge. As previously mentioned, Ang II 
signalling through AT1R has the capacity to enhance TGF-β signalling 
by inducing the expression of ligands, receptors and activators. It 
has also been reported that Ang II can activate the intracellular 
SMAD signalling cascade in VSMCs, in a TGF-β-independent 
manner” (Fig. 2). Ang II can also regulate MAPK signalling cascades 
independently of TGF-β, with the suggestion that signalling through 
its different receptor subtypes (AT1R and AT2R) can have varying and 
Figure 3 | TGF-β signalling in heritable aneurysm syndromes. The 
proposed mechanisms for the amplification of TGF-β signalling in conditions 
such as LDS are shown. a, Potential cell-autonomous mechanism of 
upregulation of TGF-β signalling in LDS. TGF-β ligand usually stimulates 
canonical and non-canonical pathways (left), with feedback inhibition 
provided by canonical signalling. In LDS (centre), TGF-β receptor kinase- 
domain mutations (depicted in green) may cause a selective decrease in 
canonical signalling and thus in feedback inhibition. Cell-autonomous 
compensation (that is, increased ligand expression and activation) to 
maintain canonical signalling (right) would result in excessive activation of 
non-canonical signalling cascades. b, The TGF-β cancer paradox. During 
tumorigenesis progression, tumour cells often lose TGF-β responsiveness as 
a method of escaping TGF-β-mediated cell-cycle arrest. A lack of feedback 


even opposing effects. Conversely, a recent study has shown that the 
resistance to Ang-II-induced aneurysm in normocholesterolaemic 
C57BL/6 mice is disrupted by systemic treatment with a neutralizing 
anti-TGF-β antibody™. Again, these lesions were inflammatory in 
nature, and the incidence of aneurysm and dissection was greatly 
attenuated after monocyte depletion. These observations indicate 
that TGF-β may be protective specifically in the setting of acute and 
intense inflammation. Another study has shown that neutralizing 
anti-TGF-β antibody provided significant protection from Ang-II-induced 
inflammatory aneurysms after targeted silencing of CXCL10, a known 
chemoattractant for monocytes and macrophages”. Taken together, 
these data suggest that TGF-β has biphasic and discordant roles in the 
pathogenesis of mouse Ang-II-induced aneurysm, and that TGF-β 
antagonism can be protective in a context-dependent manner. 


Smooth muscle cytoskeletal elements and aneurysm 

Studies with MFS mouse models have shown that phenotypic changes 
in aortic VSMCs precede elastolysis and gross medial remodelling”. 
These changes include adoption of a general ‘synthetic’ (as opposed 
to ‘contractile’) character, and morphological changes consistent 
with cytoskeletal rearrangement. More recent work identifying 
genes associated with isolated familial TAA has directly implicated 
perturbation of the contractile apparatus in the pathogenesis of 
aneurysm. Heterozygous mutations in MYH11, which encodes smooth 
muscle myosin heavy chain 11 (MYH11), cause non-syndromic 
ascending aortic aneurysm in association with patent ductus arteriosus 
and rare incidence of bicuspid aortic valve**”’”. By contrast, heterozygous 


inhibition results in upregulation of TGF-β expression by tumour cells and 
excessive activation of neighbouring signalling-competent stromal cells, which 
promotes angiogenesis and tumour invasion. c, Potential non-cell-autonomous 
mechanism of upregulation of TGF-β signalling in vascular disease. Sites 
of developmental field boundaries correspond anatomically to sites of 
predisposition for aneurysm in TGF-β vasculopathy syndromes (the aortic and 
pulmonary roots, the juxtaductal aorta and the suprarenal abdominal aorta). 
Inset shows cellular events thought to occur at the transition between second 
heart field (brown)- and cardiac neural crest (green)-derived VSMCs. A 
relative perturbation of TGF-β signalling would have a disproportionate effect 
on the more vulnerable lineage (second heart field), resulting in increased 
ligand expression and excessive TGF-β signalling by adjacent cells of a different 
lineage (cardiac neural crest) with relative preservation of signalling potential. 


mutations in ACTA2, which encodes α-smooth muscle actin (α-SMA), 
cause a vascular disorder involving high, but incomplete, penetrance 
of ascending aortic aneurysm and dissection, with a lower incidence of 
descending aortic aneurysm and dissection, patent ductus arteriosus 
and bicuspid aortic valve’. Many patients show a purplish discoloration 
of the skin in a network pattern, caused by altered tone in deep dermal 
capillaries (livedo reticularis) and pigmented cysts of the iris (iris 
flocculi). VSMCs isolated from patients with MYH11 mutations showed 
high proliferative rates, and upregulation of insulin-like growth factor 1 
(IGF-1) signalling’ and components of the Ang II signalling cascade. 
Although it has been reported that isolated patient cells do not show 
evidence of increased TGF-β signalling, the data and experimental 
conditions were not reported. This is particularly important, because 
the high TGF-β signalling seen in the aorta of humans and/or mice 
with MFS or LDS is not recapitulated in cultured VSMCs, suggesting 
the necessity of tissue-specified contexts. More recently, a study found 
that aortic tissue from a patient with MYH11 mutations showed high 
VSMC expression of the calcium-binding protein S100A12, an event 
previously linked to high TGF-β expression and signalling in mice”. 
In addition to MYH11 and α-SMA, a third cytoskeletal (but non- 
contractile) protein has been implicated in aortic aneurysm. The large 
actin-binding protein filamin-A shows altered expression or function 
in the neurological condition periventricular nodular heterotopia”. 
Encoded by the FLNA gene, on the X chromosome, filamin-A has 
a diverse repertoire of binding partners, making the delineation of 
a specific causal pathway difficult. Moreover, only a small subset of 
filamin-A-deficient women have been reported to show a predisposition 
for aneurysm, generally in association with other systemic connective 
tissue findings. Mutations in FLNA have independently been implicated 
in myxomatous valve dystrophy, a phenotype commonly seen in 
syndromic aneurysm conditions such as MFS and LDS. Filamin-A can 
function as a positive regulator of TGF-β signalling through modulation 
of RhoA and SMAD protein trafficking, but also contributes to negative 
regulation of ERKs*” (Fig. 2). 

Many aspects of TGF-β signalling have links to the cytoskeleton, 
prominently including trafficking and activity of TGF-β receptors and 
signalling effectors (reviewed in ref. 24). Filamin-A can bind to receptor- 
activated SMAD proteins, including SMAD2, and filamin-A-deficient 
melanoma cells show impaired TGF-β signalling compared with cells 
transfected with a filamin-A-encoding vector”. The force generated by 
cellular contraction against a resistance imposed by the neighbouring 
matrix has been shown to contribute positively to TGF-β activation 
in an integrin-dependent manner™. Acute disruption of the actin 
cytoskeleton in human mesangial cells using cytochalasin D, for example, 
has been shown to reduce SMAD2 phosphorylation and TGF-β-induced 
collagen α-1(I) (but not α-1(IV)) expression“. Other studies have shown 
that cytoskeletal disruption can induce the expression of TGF-β-driven 
gene products, such as plasminogen activator inhibitor-1 (PAI-1) and 
connective tissue growth factor, in VSMCs independently of TGF-β ligand 
or receptor, and that these effects are at least partly mediated by activation 
of the ROCK and/or MAPK signalling cascades”. It remains impossible to 
determine the chronic consequences of such manipulations in the context 
of a healthy or diseased tissue. 


Prospects for aneurysm treatment 

Standard medical therapy for aortic aneurysm has revolved around
blood pressure control to limit aortic wall stress. The implication of
TGF-β signalling in the pathogenesis of aortic aneurysm suggested an
opportunity for more specific therapy. Blockade of Ang II signalling
through AT1R had previously been shown to limit TGF-β signalling and
fibrosis in rodent models of chronic kidney disease. Indeed, in a mouse
model of MFS, the AT1R blocker losartan prevented progressive aortic
aneurysm. Potential mechanisms for aneurysm treatment include the
prevention of AT1R-induced expression of TGF-β ligands, receptors
and activators such as thrombospondin-1 (TSP-1) or MMPs. Prenatal
initiation of losartan treatment in MFS mouse models resulted in full
normalization of aortic root size, aortic root growth rate and aortic wall
architecture. Importantly, postnatal initiation of therapy in the context of
established aneurysmal dilatation and medial degeneration also achieved
full suppression of aortic root growth and productive remodelling of the
aortic wall, with decreased elastin fragmentation and matrix deposition.
Several observations pointed to TGF-β antagonism as the relevant
mechanism. First, the protection achieved by losartan correlated with
reduced nuclear accumulation of pSMAD2 and reduced expression of
TGF-β-driven gene products such as PAI-1 and CTGF. Second, other
agents with comparable blood-pressure-lowering effects that did not
alter TGF-β signalling were associated with a small decline in aortic root
growth rates compared with losartan, and had no effect on aortic wall
architecture. Third, losartan limited the growth of the aortic root, which
showed pathological dilatation and increased TGF-β signalling, but had
no effect on the growth of other aortic segments, which showed neither.
Losartan also limited aortic root growth in a subset of children with severe
and rapidly progressive MFS. Several large and randomized clinical trials
of losartan in MFS are under way.


Embracing paradox 

Despite recent advances in understanding aneurysm pathogenesis,
many paradoxes remain to be reconciled. For example, it is not clear
why defects in structural or regulatory matrix elements, signalling
molecules or contractile proteins culminate in focal aneurysms rather than
a diffusely fragile and dilated arterial tree. If haemodynamic stress is
the answer, it remains unclear why lesions occur on both the high- and
low-pressure (that is, the root of the pulmonary artery) sides of the
circulation. Another paradox is why conditions such as MFS and LDS
show a high predisposition for focal aneurysm and primary dissection
in the periductal region of the proximal descending thoracic aorta.
In such conditions, it is not known why therapies aimed at blunting
TGF-β signalling at the level of ligand bioavailability (such as a
neutralizing anti-TGF-β antibody or losartan) or events far downstream
of TGF-β (such as MMP inhibition with doxycycline) make things
better, whereas events or manipulations that target the intracellular
signalling cascade (such as TGFBR1 or TGFBR2 mutations in LDS, or
the introduction of SMAD4 haploinsufficiency in fibrillin-1-deficient
mice) routinely make things worse. The paradoxical increase in
TGF-β signalling observed in patients with LDS could be explained
if receptor-mediated activation of the canonical and non-canonical
cascades relies on separable activities of the receptor complex. It has
been suggested that distinct kinase activities underlie the phosphorylation
of SMAD proteins and the ShcA adaptor protein — the proximal
events in the ERK cascade. Furthermore, TGF-β-mediated activation
of other MAPKs has been linked to receptor-mediated ubiquitylation of
the TRAF6 ubiquitin ligase (Fig. 2). If abnormal receptor complexes in
LDS have relative preservation of non-canonical signalling but feedback
regulation is disproportionately governed by canonical signalling, then
compensatory mechanisms, such as increased ligand expression and/or
activation, would drive excessive non-canonical TGF-β signalling in a
cell-autonomous manner (Fig. 3a).

Figure 4 | TGF-β signalling in LDS aorta. Immunohistochemical staining
for nuclear pSMAD2 (a marker of TGF-β signalling) in the aortic root of a
control individual and a patient with LDS. An enlarged view of the LDS aorta
is also shown (far right). Although TGF-β signalling is markedly increased
in LDS, two distinct cell populations are observed in the aortic media, with
absent (blue nuclei) or strongly positive (brown nuclei) activity. Original
magnifications, ×10 (left two images; from a montage of images spanning the
entire thickness of the aortic wall) and ×60 (far right image).


Cancer biology as a guide to aneurysm 

In many respects, there are intriguing parallels to be drawn with the
TGF-β cancer paradox. TGF-β can function as a tumour suppressor,
with prominent roles in the maintenance of cellular differentiation and
induction of cell-cycle arrest and apoptosis. During early tumorigenesis,
many tumour types lose responsiveness to TGF-β through biallelic
loss-of-function mutations in genes that encode the TGF-β receptors
or intracellular mediators of signalling (ref. 4 and references therein).
Attenuation or loss of tumour responsiveness to TGF-β leads to increased
signalling by the neighbouring signalling-competent stroma, owing to
increased TGF-β ligand expression (Fig. 3b). Consequences include
impaired tumour surveillance owing to inhibition of adaptive immunity,
acceleration of tumour growth because of enhanced angiogenesis,
tumour invasion and metastasis, enhancement of innate immunity
(mediated, at least in part, through MCP-1) and/or stimulation of
epithelial-to-mesenchymal transition. TGF-β signalling can be further
amplified in the tumour microenvironment through enhanced ligand
expression by recruited inflammatory cells or enhanced TGF-β
activation as a consequence of increased expression of activators such
as MMPs. This sequence of events is not simply a function of the cell
types per se, but more crucially the interface between cell types with a
mismatch in their intrinsic capacity to support TGF-β signalling.

Could the cancer paradox inform the pathogenesis of aneurysm? The
first potential link comes from a consideration of cellular ontogeny in the
vascular system. VSMCs at the base of the aorta and pulmonary artery are
derived from specialized cardiogenic mesoderm termed the second heart
field (Fig. 3c), a finding with possible implications for MFS. The ascending
aorta is a chimaera between cells derived from the second heart field and
the ectodermal cardiac neural crest (CNC), whereas the CNC is the sole
origin of VSMCs in the more distal ascending aorta and the transverse
arch. There is an abrupt transition to somatic mesoderm-derived cells in
the proximal descending thoracic (juxtaductal) aorta and a contribution
of splanchnic mesoderm to the descending aorta beginning just below the
diaphragm. Thus, although there is not a common origin for VSMCs at
frequent sites of predisposition for aneurysm, an interaction between cells
of divergent origin exists at each location. Lineage-specific differences in
intrinsic TGF-β signalling capacity were addressed in a study comparing
the performance of ectoderm-derived cells from the aortic arch and
mesoderm-derived cells from the abdominal aorta of chick embryos.
After stimulation with TGF-β1, ectoderm-derived VSMCs showed
increased DNA synthesis and robust transcriptional activation of a TGF-β
reporter allele, whereas mesoderm-derived VSMCs showed little reporter
activation and growth inhibition. TGF-β responsiveness correlated with
differences in the glycosylation status of TGFR-2. The authors concluded
that different SMC populations within a common vessel wall respond
in lineage-dependent ways to growth factors that govern developmental
events and that might participate in vascular disease in later life.

It seems plausible that cells of one lineage, such as mesodermal cells in
the aortic root and descending thoracic aorta, would be more sensitive
to a perturbation of TGF-β signalling. For example, this could be caused
by a failure to concentrate cytokine in the setting of fibrillin-1 deficiency
or by altered signalling capacity owing to heterozygous loss-of-function
TGF-β receptor mutations. Loss of TGF-β feedback would initiate
compensatory events such as increased TGF-β ligand expression,
which could, in turn, stimulate neighbouring cells that can tolerate
the primary insult better owing to improved reserves in their inherent
signalling capacity (such as CNC-derived VSMCs). Indeed, inspection
of the aortic wall from aneurysm tissue obtained at surgery shows
an apparent binary status of medial VSMCs for TGF-β signalling (as
determined by staining for nuclear pSMAD2), with neighbouring cells
showing either strong or absent activity (Fig. 4). Such a model not only
accommodates but also mandates a mechanism for impaired TGF-β
signalling, and would explain how compensatory events could lead to
functional overshoot. It would also explain why aneurysms occur at the
margins of the CNC developmental field, but not typically in its middle
(Figs 1 and 3c). In such a model, manipulations that limit TGF-β ligand
bioavailability or block terminal pathogenetic events would be effective
at preventing aneurysm, whereas those that accentuate the signalling
imbalance would prove detrimental. Despite challenges, the effort to
refine mechanistic understanding is justified by the high probability that
these insights will yield treatment strategies for aneurysms and perhaps
other clinical states associated with impaired vessel wall homeostasis.
Progress and challenges in translating
the biology of atherosclerosis


Peter Libby, Paul M. Ridker & Göran K. Hansson


Atherosclerosis is a chronic disease of the arterial wall, and a leading cause of death and loss of productive life years 
worldwide. Research into the disease has led to many compelling hypotheses about the pathophysiology of atherosclerotic 
lesion formation and of complications such as myocardial infarction and stroke. Yet, despite these advances, we still lack 
definitive evidence to show that processes such as lipoprotein oxidation, inflammation and immunity have a crucial 
involvement in human atherosclerosis. Experimental atherosclerosis in animals furnishes an important research tool, 
but extrapolation to humans requires care. Understanding how to combine experimental and clinical science will provide 
further insight into atherosclerosis and could lead to new clinical applications. 


Powerful laboratory research in the past decade has led to many
reviews that describe the biological and genetic bases of
atherosclerosis. Despite this progress, the leap from experimental
animal findings to human atherosclerosis and clinical application
presents challenges. The laboratory literature and experimental community
sometimes assume that the results obtained in cultured cells or animals
closely correspond to humans. Although experimental work has helped
to unravel some of the principles of atherosclerosis pathophysiology, gaps
remain in translation to the clinic, and these breaches require bridging to
achieve the full promise of scientific advances in atherosclerosis.

This Review summarizes the burgeoning biological understanding of
atherosclerosis. Instead of celebrating the astounding advances already
achieved, we highlight some of the challenges to the clinical application
of these advances. We also offer possible ways to move forward and
overcome these obstacles.


Current concepts of atherogenesis 

Atherogenesis refers to the development of atheromatous plaques in
the inner lining of the arteries. On the basis of animal experiments and
observations in human specimens, most contemporary schemes of
atherogenesis posit an initial qualitative change in the monolayer of
endothelial cells that lines the inner arterial surface (Fig. 1a). Arterial
endothelial cells, which normally resist attachment of the white blood
cells streaming past them, express adhesion molecules that capture
leukocytes on their surfaces (Fig. 1b) when subjected to irritative stimuli
(such as dyslipidaemia, hypertension or pro-inflammatory mediators).
Parallel changes in endothelial permeability and the composition of
the extracellular matrix beneath the endothelium promote the entry
and retention of cholesterol-containing low-density lipoprotein (LDL)
particles in the artery wall. Biochemically modified components of
these particles may induce leukocyte adhesion, and intact but modified
particles undergo endocytosis by monocyte-derived macrophages,
leading to intracellular cholesterol accumulation. Chemoattractant
mediators direct the migration of the bound leukocytes into the
innermost layer of the artery, the tunica intima (Figs 1b and 2). The
localized distribution of atheromatous lesions in the arterial tree,
despite a systemic rise in risk factors such as increased LDL levels or
blood pressure, probably reflects differing haemodynamics in different
segments of the arterial tree, distinction in the regional development
of arteries and the ability of normal laminar shear stress to elicit an
atheroprotective program of gene expression by the endothelium. Once
resident in the artery wall, monocytes — the most numerous white
blood cells in plaques — differentiate into tissue macrophages. In the
nascent atheroma, these mononuclear phagocytes engulf lipoprotein
particles and become foam cells — a term that reflects the microscopic
appearance of these lipid-laden macrophages.

In mice, a pro-inflammatory subset of monocytes induced by
hyperlipidaemia may preferentially furnish the precursors of lesional
foam cells, but the fates and functions of this monocyte subset and its
human equivalent remain under intense exploration. Macrophages in
the atheroma may also have a pro-inflammatory palette of functions,
characteristic of M1 macrophages, which produce high levels of
effectors such as the cytokines interleukin-1β (IL-1β) and
tumour-necrosis factor (TNF). Some mononuclear phagocytes in plaques have
the characteristics, and probably the antigen-presenting functions,
of dendritic cells. Other leukocyte classes (such as lymphocytes)
and mast cells also accumulate in atheromata, but less abundantly
than phagocytes. Lesional T cells, although far fewer in number than
macrophages, probably have key regulatory functions in plaques.

Atheroma formation also involves the recruitment of smooth
muscle cells (SMCs) from the tunica media — the middle layer of
the artery wall — into the tunica intima (Fig. 1c). Unlike that of most
experimental animals used to study atherosclerosis, the intima of
human arteries (including the coronary arteries) contains resident
SMCs. During atherogenesis, other SMCs migrate from the media into
the intima, and proliferate in response to mediators such as
platelet-derived growth factor. In the intima, the SMCs produce extracellular
matrix molecules, including interstitial collagen and elastin, and
form a fibrous cap that covers the plaque. This cap typically overlies
a collection of macrophage-derived foam cells, some of which die
(for example, by apoptosis) and release lipids that accumulate
extracellularly. The inefficient clearance of dead cells — a process
known as efferocytosis — can promote the accumulation of cellular
debris and extracellular lipids, forming a lipid-rich pool called the
necrotic core of the plaque.

Plaques generally cause clinical manifestations by producing
flow-limiting stenoses that lead to tissue ischaemia, or by provoking
thrombi that can interrupt blood flow locally or embolize and lodge
in distal arteries. Paradoxically, thrombotic complications do not
always occur at the sites of the most severe arterial narrowing by
plaques. Instead, thrombi often arise after physical disruption of the
plaque, most commonly a fracture of the fibrous cap that exposes
pro-coagulant material in the plaque’s core to coagulation proteins in the
blood, triggering thrombosis (Fig. 1d). Plaques that rupture typically
have thin, collagen-poor fibrous caps with few SMCs but abundant
macrophages. The inflammatory cells may hasten plaque disruption
by elaborating collagenolytic enzymes that can degrade collagen, and
by generating mediators that provoke the death of SMCs, the source
of arterial collagen. Plaque macrophages also produce the
pro-coagulant tissue factor that renders the lipid core thrombogenic. Thus,
the infiltrating inflammatory cells interact with the intrinsic arterial
cells (smooth muscle and endothelium), promoting lesion formation
and complications.

The risk factors for atherosclerosis act at several points on
this pathogenic pathway. Hypertension is a major risk factor for
atheromata, and can increase arterial wall tension, leading to disturbed
repair processes and aneurysm formation. Angiotensin II, a major
pressor hormone, can alter endothelial function, inciting leukocyte
adhesion. Cigarette smoking and diabetes also affect vascular biology,
but through less well understood mechanisms. The role of cholesterol
has been investigated in great detail, yielding success in cardiovascular
prevention strategies.

Figure 1 | Stages in the development of atherosclerotic lesions. The normal
muscular artery and the cell changes that occur during disease progression
to thrombosis are shown. a, The normal artery contains three layers. The
inner layer, the tunica intima, is lined by a monolayer of endothelial cells
that is in contact with blood overlying a basement membrane. In contrast
to many animal species used for atherosclerosis experiments, the human
intima contains resident smooth muscle cells (SMCs). The middle layer, or
tunica media, contains SMCs embedded in a complex extracellular matrix.
Arteries affected by obstructive atherosclerosis generally have the structure of
muscular arteries. The arteries often studied in experimental atherosclerosis
are elastic arteries, which have clearly demarcated laminae in the tunica
media, where layers of elastin lie between strata of SMCs. The adventitia, the
outer layer of arteries, contains mast cells, nerve endings and microvessels.
b, The initial steps of atherosclerosis include adhesion of blood leukocytes
to the activated endothelial monolayer, directed migration of the bound
leukocytes into the intima, maturation of monocytes (the most numerous
of the leukocytes recruited) into macrophages, and their uptake of lipid,
yielding foam cells. c, Lesion progression involves the migration of SMCs
from the media to the intima, the proliferation of resident intimal SMCs
and media-derived SMCs, and the heightened synthesis of extracellular
matrix macromolecules such as collagen, elastin and proteoglycans. Plaque
macrophages and SMCs can die in advancing lesions, some by apoptosis.
Extracellular lipid derived from dead and dying cells can accumulate in the
central region of a plaque, often denoted the lipid or necrotic core. Advancing
plaques also contain cholesterol crystals and microvessels. d, Thrombosis,
the ultimate complication of atherosclerosis, often complicates a physical
disruption of the atherosclerotic plaque. Shown is a fracture of the plaque’s
fibrous cap, which has enabled blood coagulation components to come into
contact with tissue factors in the plaque’s interior, triggering the thrombus that
extends into the vessel lumen, where it can impede blood flow.


Lipids and atherosclerosis

Lipids have a central role in the pathogenesis of plaques, but the
mechanistic links between lipids and atherogenesis remain unclear.
Observational data support a strong association between plasma
lipid levels and the risk of cardiovascular disease. In particular,
LDL levels satisfy modified Koch's postulates — criteria for judging
whether a specific microbe is the cause of a disease — for causality of
atherosclerosis. LDL levels correlate with the risk of cardiovascular
events in human populations, and augment individual susceptibility
to atherosclerosis and its complications. Monogenic disorders that
raise plasma levels of LDL heighten cardiovascular risk. Several
interventions that lower LDL levels by independent mechanisms
diminish the likelihood of atherosclerotic events.


The LDL success story lacks a final chapter

The determination of the LDL pathway and therapy with inhibitors
of hydroxymethyl glutaryl coenzyme A reductase (collectively known
as statins), which regulate this pathway, are conspicuous victories of
cardiovascular science and medicine. But even in patients treated
with statins, a considerable residual burden of cardiovascular risk
remains. More than 20% of patients will have a recurrent event within
30 months of an acute coronary syndrome, despite receiving high-dose
statin treatment. These findings indicate that treatments to decrease
LDL levels even further, beyond the targets currently mandated by
various national guidelines, could provide further clinical benefit.
Unfortunately, at least one-quarter of high-risk patients who receive
intensive statin therapy have LDL levels above current
guideline-mandated goals. New biological targets have emerged that may
yield incremental lowering of LDL levels to a greater degree than that
achieved by high-dose statin therapy (Box 1).


HDL as a frustrating next frontier 

Consistent evidence has shown that levels of high-density lipoprotein
(HDL) correlate inversely with cardiovascular risk. Numerous
approaches to increase HDL exist or are in development. Because
of the heterogeneity in HDL particles, the complicated pathways of
cholesterol flux mediated by HDL and the association of HDL with
many proteins that may modify atherosclerosis, the steady-state levels
of HDL cholesterol in blood reflect HDL function poorly. HDL particles
can effect reverse cholesterol transport, and transfer cholesterol from
peripheral tissues to the liver for excretion. This process involves the
unloading of cholesterol from lipid-laden macrophages in atheromata
by means of membrane-bound ATP-binding cassette transporters.
Mature HDL interacts with one ATP-binding cassette transporter
(ABCG1), and nascent HDL with another (ABCA1) (Fig. 2).

In addition to mediating reverse cholesterol transport, HDL can
exert anti-inflammatory actions both in vitro and in vivo. HDL
particles associate with dozens of proteins, many with biological
activities that have relevance to atherogenesis. The lipid content of
HDL particles can be remodelled — for example, the plasma protein
cholesteryl ester transfer protein (CETP) facilitates the exchange of
cholesteryl esters in HDL for triglycerides from
apolipoprotein-B-containing lipoproteins (Fig. 2). The protein content of HDL
particles can also be remodelled — for example, when plasma
levels of the acute-phase reactant serum amyloid A increase during
inflammatory states. Typical clinical assays for HDL do not reflect
this high degree of heterogeneity of the particles that influence plaque
biology. Thus, the mere increase in HDL levels in response to some
interventions may not necessarily confer clinical benefit, owing to
qualitative changes in the particles. By contrast, the lowering of LDL
levels usually reduces cardiovascular event rates. Of the approaches
to increase HDL under study, the potential of CETP inhibition to
improve outcomes remains unclear. The CETP inhibitor torcetrapib
failed in the clinic, probably owing to off-target effects, and two
other CETP inhibitors, dalcetrapib and anacetrapib, have entered
clinical evaluation. The safety of anacetrapib was recently affirmed
by a phase III clinical trial, which provided preliminary evidence for
reduced clinical events. Ultimately, the results of continuing large
end-point trials should settle the CETP controversy.

Apolipoprotein A-I (Apo-AI), the major protein component of
HDL, has received much attention as a possible therapeutic target for
atherosclerosis. But difficulties have plagued the development of
protein therapeutics and mimetics. Despite small biomarker studies that
suggest possible efficacy of some such agents, various limitations have
stalled their entry to trials that could show efficacy in cardiovascular
event reduction.

Manipulation of the transcription of APOA1 has proven elusive,
with only one agent in development for this purpose. Stimulation of
the nuclear receptor peroxisome proliferator-activated receptor-α
(PPAR-α) moderately increases Apo-AI levels. Moreover, preclinical
and biomarker studies have suggested beneficial vascular actions of
PPAR-α agonism that do not depend on Apo-AI. Clinical trials of
one agent with PPAR-α-stimulating activity, gemfibrozil, have shown a
reduction in cardiovascular events. Unfortunately, the combination
of gemfibrozil with statins raises major safety concerns, owing to a
well-defined drug–drug interaction. Another agent with relatively weak
PPAR-α-agonist action, fenofibrate, has not reduced events in several
large clinical trials.


BOX 1

New biological targets for lowering LDL levels

Here we consider biological targets that may reduce LDL levels to a
greater extent than that obtained by high-dose statin therapy.


Niemann-Pick C1-like protein 1

Inhibition of the intestinal cholesterol transporter Niemann-Pick C1-like
protein 1 (NPC1L1) by the agent ezetimibe can reduce LDL levels by
almost 20% in individuals already being treated with statins. Although
combined therapy with statins and ezetimibe can help more individuals
to reach mandated LDL targets for their level of risk, no clinical trial data
have so far shown that this strategy will lower cardiovascular event rates
beyond the drop produced by statin monotherapy. Studies of biomarkers
such as the thickness of the carotid artery intima media, flow-mediated
vasodilation, or inflammation cannot supplant missing data on clinical
events. This example emphasizes three important points: (1) the need
to choose carefully the biomarkers to be pursued in clinical development;
(2) the ultimate requirement for clinical end-point studies to determine
the efficacy of interventions; and (3) the value of starting such definitive
studies early in drug-development programmes.


Proprotein convertase subtilisin/kexin type 9

Genetic studies have shown that mutations in the gene that encodes
the enzyme proprotein convertase subtilisin/kexin type 9 (PCSK9)
augment LDL receptor levels on cell surfaces, boosting LDL clearance
and yielding lower LDL concentrations in the blood. The enzymatic
activity of PCSK9 — autocatalysis — does not directly degrade LDL
receptors. Although enzymes generally make good drug targets,
the autocatalytic activity of PCSK9 has proven difficult to inhibit
by conventional medicinal chemistry approaches, and does not
necessarily reflect its regulation of LDL receptor levels, spurring the
development of biological agents that seek to limit PCSK9 action.
Individuals with loss-of-function variants in PCSK9, who are exposed
to lower levels of LDL from childhood than those with the common
genotype for this enzyme, seem protected from atherosclerotic
events even when they have other cardiovascular risk factors. This
observation suggests that lowering LDL levels for longer periods than
those encompassed by typical clinical trials should continue to provide
benefit, and supports a pivotal, perhaps permissive, role for LDL in
atherosclerosis. Such genetic data also help to clarify the importance
of LDL lowering compared with other potential mechanisms of the
benefits of statins (for example, statins interfere with prenylation of
small G proteins, modulate lipid-raft organization and activate
Krüppel-like factor 2).
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Figure 2 | The intersection of inflammation and lipid metabolism 
modulates atherosclerosis and provides potential targets for therapeutic 
manipulation. Atherogenesis begins with the recruitment of inflammatory 
cells to the intima. Activated endothelial cells express leukocyte adhesion 
molecules that capture blood monocytes, including (but not exclusively) the 
pro-inflammatory subset marked by high expression levels of the cell-surface 
protein Ly6C in mice. After inflammatory activation, monocytes recruited 
to the intima express scavenger receptors that permit the uptake of modified 
LDL particles, such as oxidized LDL (oxLDL). Cholesterol loading leads to 
the formation of foam cells, and ultimately leads to the mature lipid-laden 
macrophages of the plaque’s core. These cells can produce pro-inflammatory 
mediators, reactive oxygen species, and tissue factor pro-coagulants that 
amplify local inflammation and promote thrombotic complications. 
Although fewer in number than the mononuclear phagocytes, T cells also 
enter the intima and send decisive regulatory signals. After antigen-specific 
activation, T helper 1 (T,,1) cells secrete the signature cytokine interferon-y 
(IFN-y), which can activate vascular wall cells and macrophages, and magnify 
and sustain the inflammatory response in the intima. Regulatory T (T,,.) 
cells produce interleukin-10 (IL-10) and transforming growth factor-B 
(TGF-f), two cytokines considered to exert anti-inflammatory actions. 
Although not numerically prominent in the plaque, B cells accumulate and 
organize in the perivascular tissue surrounding atherosclerotic arteries. 
They produce circulating antibodies that may limit inflammation and mute 
atherogenesis. In addition to modified LDL, triglyceride-rich lipoproteins 
such as very-low-density lipoprotein (VLDL), particularly those particles 
that bear apolipoprotein C-III (Apo-CIII) or apolipoprotein B (Apo-B), 
can instigate vascular inflammation through Toll-like receptor 2 (TLR2) 
signalling. Macrophage foam cells can efflux cholesterol (chol) through 
ATP-binding cassette (ABC) transporters, which work in tandem. ABCA1 
loads cholesterol-poor nascent high-density lipoprotein (HDL) particles 
(pre-β HDL) with cholesterol. ABCG1 can load more mature HDL particles 
with cholesterol. Having taken up cholesterol through interaction with the 
ABC transporters in the artery wall, HDL particles can exit through the 
bloodstream, contributing to reverse cholesterol transport from lesional 
macrophages to the periphery. VLDL and LDL particles bearing ApoB can 
unload cholesterol from HDL particles through the action of cholesteryl ester 
transfer protein (CETP). Blockade of CETP can thus augment HDL levels, 
a process not yet known to produce clinical benefit. The ApoB-containing 
lipoproteins can promote clearance of cholesterol through capture by 
peripheral LDL receptors. Loss-of-function mutations in the enzyme PCSK9 
(not shown) can increase the number of LDL receptors on peripheral cells, 
thereby augmenting the clearance of LDL. 


Nicotinic acid raises HDL levels effectively and has shown some event 
reduction in clinical trials, but tolerability issues have limited its use. 
The recognition of G-protein-coupled receptors for nicotinic acid, and 
of β-hydroxybutyrate as an endogenous ligand of one such receptor, 
has not yet led to a therapeutic approach. Clinical end-point trials 
are testing whether adding nicotinic acid to standard care (usually 
including statin treatment) improves cardiovascular outcomes. One 
trial is using an extended-release preparation of nicotinic acid combined 
with a prostaglandin D2 receptor antagonist, intended to reduce the 
cutaneous flushing that limits the acceptability of high-dose nicotinic 
acid for many patients. 

Thus, despite considerable understanding of HDL and its metabolism, 
none of the pharmacological agents tested so far has offered a practical 
and proven way to reduce cardiovascular events. We must await the 
results of ongoing trials of approaches to raise HDL levels to reach this 
elusive goal. 


Triglycerides on trial 

Fasting or non-fasting triglyceride levels can predict cardiovascular 
events, but adjustment for other risk factors considerably weakens 
or even abolishes the association. Lifestyle changes such as 
weight loss, physical activity or low-carbohydrate diets can lower 
blood-circulating levels of triglycerides. The clinical benefits of 
such lifestyle modifications probably result from a combination 
of mechanisms, so they cannot affirm triglycerides as a causal risk 
factor for atherosclerosis. Strict control of diabetes can also lessen 
hypertriglyceridaemia, yet tight glycaemic control may increase, rather 
than prevent, clinical complications of atherosclerosis in people with 
type 2 diabetes. Fibrates effectively decrease triglyceride levels, 
but trials of these drugs have proven disappointing in reducing 
clinical events, and have not shown reductions in mortality rates. 
Omega-3 fatty acids, prominent constituents of certain fish oils, 
reduce triglyceride levels and can limit cardiovascular events in some 
populations. These fatty acids also have anti-arrhythmic action, can 
mute inflammation and impair platelet aggregability, which precludes a 
conclusion about the extent to which their clinical benefit arises from 
lowered triglyceride levels. 

Recent evidence suggests that fractions of triglyceride-rich 
lipoproteins, particularly those that contain Apo-CIII, confer risk 
not conveyed by the total triglyceride level. Indeed, Apo-CIII acts 
as a pro-inflammatory mediator and an endogenous ligand of the 
Toll-like receptor 2 signalling pathway, which is implicated in the 
aggravation of mouse atherosclerosis (Fig. 2). In addition, 
very-low-density lipoprotein (VLDL) promotes transcription of the 
plasminogen activator inhibitor-1 gene, leading to a reduced capacity 
to lyse thrombi. 


Oxidation unproven to boost human atherogenesis 

Experimental data indicate that oxidized LDL (oxLDL) within the 
arterial wall promotes atherogenesis (Fig. 2). 
Yet direct evidence for the participation of LDL oxidation in human 
atherosclerosis remains scarce. Furthermore, the transition-metal- 
dependent chemistry that is commonly used to produce oxLDL in the 
laboratory may not reflect oxidative processes at work in the artery wall 
during atherogenesis. Such in vitro preparations of oxLDL vary by day, 
by donor and by laboratory, and constitute highly heterogeneous and 
undefined mixtures of biologically active substances. 

Well-powered randomized clinical trials have consistently shown 
that antioxidant vitamins, such as vitamins E and C, do not reduce 
cardiovascular events over the timescale and doses studied. For 
many reasons, though, these studies of antioxidant vitamins do not 
themselves invalidate the oxidant hypothesis of atherosclerosis. One 
potent non-vitamin antioxidant (succinobucol) also failed to reduce 
coronary events in a large-scale trial. 

Certain phospholipases may generate toxic or pro-inflammatory 
moieties from oxidized phospholipids associated with oxLDL. Clinical 
end-point trials with inhibitors of two of these enzymes are under way, but 
a small biomarker study with a lipoprotein-associated phospholipase A2 
inhibitor did not meet its prespecified coprimary end points, again 
illustrating the quandary faced in pharmaceutical development about the 
design of smaller, short-term biomarker studies that can guide decisions 
about large clinical-outcome trials. Thus, despite the emergence of 
chemically defined active lipids and numerous publications about the 
potentially pro-atherogenic effects of oxLDL, its in vivo relevance in 
humans and its therapeutic manipulation remain speculative. 


Translation from mice to humans 

The discovery of T cells in human atheromata, and the subsequent 
identification of almost all cell types involved in innate and adaptive 
immunity in human plaques, raised the possibility that the immune 
system could participate in atherogenesis (Fig. 2). Markers 
of local adaptive and innate immune activation (such as major 
histocompatibility complex (MHC) antigens and leukocyte adhesion 
molecules) in plaques suggested the functional significance of immune 
cells in lesions. Studies in mice have established important modulatory 
roles of immunity in experimental atherosclerosis. T helper 1 (TH1) 
cells, which produce the pro-inflammatory cytokines interferon-γ 
(IFN-γ) and TNF, have powerful pro-atherosclerotic actions in 
hypercholesterolaemic mice. This situation resembles that in several 
other inflammatory and autoimmune diseases, such as rheumatoid 
arthritis and type 1 diabetes, raising the question of whether specific 
antigens drive atherosclerosis. Two candidate autoantigens have 
emerged most prominently: LDL and heat-shock protein 60 (HSP60). 
Cellular and humoral immune responses are mounted towards these 
antigens in humans and other experimental animals, and protective 
immunization strategies in mice have provided encouraging results. 
Of note, the cellular immune response to LDL specifically targets 
components of the native LDL particle rather than any 
oxidation-induced epitopes. Thus, fully oxidized LDL particles may not activate 
an adaptive immune response. For both LDL and HSP60, the extent 
to which the autoimmune response involves molecular mimicry with 
microbial pathogens remains undetermined. 

Both stimulatory and inhibitory immune mechanisms operate 
during atherogenesis in hyperlipidaemic mice (Fig. 2). 
Anti-inflammatory cytokines, such as IL-10 and transforming growth 
factor-β, counterbalance the pro-inflammatory pathways. Regulatory 
T (Treg) cells produce anti-inflammatory signals that tend to counteract 
TH1-cell production of IFN-γ, a cytokine that has long been implicated 
in atherogenesis. By contrast, some (but not all) studies indicate a 
pro-atherogenic role for TH17 cells, which are a source of IL-17 (refs 66-70). 
B cells may have a protective role in atherosclerosis, as the removal of 
specific B-cell populations by splenectomy, for example, can aggravate 
atherosclerosis, and the transfer of antibodies reactive to LDL and its 
components has shown atheroprotective effects in mice. Other 
experiments paradoxically show reduced atherosclerosis in mice after 
B-cell depletion, suggesting that B-cell subpopulations may have 
contrasting roles in the pathogenesis of atherosclerosis. 

Although these principles could apply to human atherosclerosis, 
direct translation from mouse studies is problematic. The accelerated 
atherogenesis in mice contrived to have hypercholesterolaemia requires 
cholesterol levels that far exceed those commonly encountered in the 
clinic, and does not reflect the chronic nature or complexity of the human 
disease. An approach to this problem would be to develop and adopt 
mouse models with lipoprotein metabolism closer to that in humans, or 
to use more moderate levels of dyslipidaemia over longer time periods. In 
studies of adaptive immunity, mice that are ‘humanized’ to carry human 
MHC genes may prove informative about antigen specificity. 

The mouse immune system, although well understood and readily 
manipulated, diverges in many ways from that of humans. For 
example, humans lack the clear-cut TH1 and TH2 polarization found 
in mice. FOXP3 expression is a useful marker of Treg cells in mice, but 
evidence suggests that it does not show the same degree of fidelity 
with Treg-cell functionality in humans. The concept of polarization of 
macrophage functions probably does apply to humans, but the markers 
of classical (M1) versus alternative (M2) activation patterns in mice 
(for example, inducible nitric oxide synthase and arginase-1) differ 
from those in humans. Therefore, whenever possible, findings in 
mice should stimulate parallel studies in human biobanks and clinical 
studies. Identification of a putative disease-promoting molecule 
in human lesions, for example, or an increased risk associated with 
a genetic variant encoding the molecule, would lend value to a 
mechanistic study in gene-targeted mice. 

Despite the disparities between the mouse and human immune systems 
(both innate and adaptive), clinical attempts at immunomodulation of 
atherosclerosis have begun. In particular, the recognition that humoral 
immunity can confer protection against experimental atherogenesis has 
spawned clinical trials involving the infusion of anti-LDL antibodies 
intended to reduce atherosclerosis. Vaccination studies with immunogens 
derived from LDL are also under way; experimental studies indicate that 
they evoke atheroprotective immunity involving cellular and humoral 
responses. 


Inflammation in atherosclerosis at the crossroads 

A unifying view of the pathophysiology of atherosclerosis proposes 
that inflammation has a key role and transduces the effects of many 
known risk factors for the disease. Inflammatory signalling alters 
the behaviour of the intrinsic cells of the artery wall (endothelium 
and smooth muscle), and recruits further inflammatory cells that 
interact to promote lesion formation and complications (Figs 1 
and 2). The application of biomarkers of inflammatory status, such 
as C-reactive protein (CRP), has lent clinical credence to this concept, 
but not without controversy. There remain many unanswered 
questions about the application of inflammation biology to human 
atherosclerosis. The association of inflammatory biomarkers with 
future risk of atherosclerotic complications does not demonstrate 
causality. Although the combined experimental and clinical evidence 
may convince some, the chicken-and-egg problem about causality 
remains unresolved. 

Another open issue surrounds the use of anti-inflammatory therapy 
as a treatment for atherosclerosis. Many traditional anti-inflammatory 
therapies do not improve cardiovascular outcomes, and some may even 
aggravate atherosclerotic events. These observations, usually derived 
from post-hoc analyses of clinical studies or from mining observational 
databases, may reflect off-target actions of the agents studied — such 
as glucocorticoids, non-steroidal anti-inflammatory drugs, certain 
PPAR agonists or TNF inhibitors. ‘Cardioprotective’ doses of aspirin 
(50-150 mg daily) probably act as an antiplatelet agent rather than as a 
direct anti-inflammatory intervention. Despite the anti-inflammatory 
action of inhibitors of the cyclooxygenase-2 (COX-2) enzyme, the 
pro-thrombotic effect of inhibiting prostacyclin production may 
contribute to increased cardiovascular morbidity. 


BOX 2 
Clinical trials evaluating anti-inflammatory agents 

Here we describe two randomized, placebo-controlled trials 
that assess the efficacy of proven anti-inflammatory agents as 
cardiovascular therapeutic agents. 

The cardiovascular inflammation reduction trial 

This trial proposes to randomly allocate stable patients with 
post-myocardial infarction, who are receiving a complete standard-care 
regimen (including high-dose statin therapy), to either low-dose 
methotrexate (10-15 mg per week) or placebo. The treatment of 
rheumatoid arthritis routinely uses low-dose methotrexate, which has 
anti-inflammatory efficacy and an acceptable safety record among 
patients with similar age and co-morbidity status as individuals 
with stable coronary disease. Data from seven non-randomized 
observational cohorts of patients with rheumatoid arthritis or psoriatic 
arthritis demonstrate significant reductions in vascular event rates and 
cardiovascular death among individuals taking low-dose methotrexate 
rather than other disease-modifying agents. As low-dose methotrexate is 
a generic drug, a successful outcome for the trial would provide a simple, 
cost-effective method to address residual risk related to inflammation. 

The Canakinumab Anti-inflammatory Thrombosis Outcomes Study 

This study proposes to address directly whether, compared with 
placebo, IL-1β inhibition can reduce the rates of recurrent myocardial 
infarction, stroke and cardiovascular-associated death among stable 
patients with coronary artery disease on a background of 
standard-care therapy (P.M.R., T. Thuren, A. Zalewski and P.L., manuscript 
in preparation). Canakinumab, a human monoclonal antibody, 
neutralizes the pro-inflammatory cytokine IL-1β, which is implicated 
in atherothrombosis. Cholesterol crystals stimulate the NLRP3 
inflammasome, which generates the active form of IL-1β (refs 95, 96) 
(Fig. 1c). Canakinumab significantly reduces levels of inflammatory 
biomarkers such as CRP, and is currently used to treat inherited 
IL-1β-driven inflammatory diseases such as Muckle-Wells syndrome. Because 
IL-1β may participate in autoimmune processes related to pancreatic 
dysfunction and insulin resistance, this study also has a secondary 
prespecified end point of new-onset diabetes. If successful, the trial 
would support the inflammatory hypothesis of atherothrombosis, and 
provide a new cytokine-based therapy for the secondary prevention of 
cardiovascular disease and new-onset diabetes. 

Statins effectively lower LDL and CRP levels in humans. Analyses of 
several large studies of statins in primary- and secondary-prevention 
populations suggest that some of their clinical benefit accrues from 
an anti-inflammatory action distinct from LDL lowering. The 
hypothesis that an anti-inflammatory intervention can reduce 
cardiovascular events independent of lipoprotein effects still requires 
rigorous testing. Thus, despite hundreds of studies affirming a role 
for inflammation in atherosclerosis in mice, and many intriguing 
observations in humans, Koch’s postulates remain unfulfilled. 

Ultimately, testing the inflammatory hypothesis of atherothrombosis 
will require a series of randomized, placebo-controlled trials that 
evaluate proven anti-inflammatory agents without confounding effects 
on cholesterol or platelet function as cardiovascular therapeutic agents. 
At least two such trials (Box 2) should begin soon, targeting a high-risk 
population with persistent inflammation, thus limiting the intervention 
to those most likely to benefit. 


Animal experiments versus human disease 

What lessons can we learn from the frustrations in clinical application of 
advances in atherosclerosis biology, and how can we tighten the coupling 
between scientific advances and clinical practice? Animal experiments 
have proven indispensable to studies of disease mechanisms, but we 
must not forget their limitations. Too often, the pharmaceutical or 
biotechnology sector adopts or abandons targets or strategies on the 
basis of uncritical acceptance of the results of animal studies. The 
recognition of animal preparations as ‘models’ of human disease 
requires considerable scepticism. For example, atherosclerotic lesions 
in the commonly used genetically modified mice seldom develop 
plaque disruption with thrombosis — a mechanism that commonly 
complicates the human disease. Mouse studies generally focus on the 
aorta and proximal great vessels, whereas the most important clinical 
consequences of atherosclerosis in humans arise from lesions in the 
coronary, carotid and cerebral arteries. The structure and hydrodynamics 
of these smaller muscular arteries, and even the embryonic origin of their 
SMCs, differ markedly from the large elastic arteries usually analysed 
in mouse studies. The proximal left anterior descending coronary 
artery in humans, a frequent site of lesion formation, characteristically 
contains a considerable population of intimal SMCs, even in early life, 
a major difference from mouse arteries. Coronary arterial SMCs arise 
from the proepicardial organ, not from the splanchnic mesoderm as do 
those in the descending aorta. In contrast to the aorta, in which flow 
predominates in systole, flow in the coronary arteries occurs mainly in 
diastole, and the heart's arteries experience compressive forces during 
systole. Human coronary arteries usually lie in an extensive pedicle of 
perivascular fat that may provoke outside-in signalling. 

These distinctions by no means indicate that we should discard 
animal studies of atherosclerosis, or forgo the immense power of mouse 
genetics to pose questions about pathophysiology. But animal studies 
do require judicious interpretation, and recognition of their limitations, 
when extrapolating to human disease. Experimental atherosclerosis in 
animals allows the rigorous testing of mechanistic hypotheses, but does 
not mimic the human condition entirely. 


The importance of biomarkers 

Advances in proteomic, metabolomic and genetic technologies have led 
to the accelerated identification of putative biomarkers of disease and 
risk factors for complications, and the targeting or improved efficacy 
of therapies. Selective harnessing of biomarkers can help to gauge the 
relevance of experimental results to human disease. For example, a 
highly sensitive assay to measure CRP has helped to translate to the 
clinic the results of decades of laboratory studies that implicated 
inflammatory pathways in the pathogenesis of atherosclerosis. 

Genome-wide association studies have reproducibly identified 
and validated regions of the human genome that associate with the 
risk of myocardial infarction. For example, the chromosome 9p21 
region, which consistently associates with a greater risk of myocardial 
infarction, has begun to yield new biological insight, as have several 
variants associated with lipoprotein disorders. This unbiased 
approach will identify therapeutic targets that have eluded the classical 
model of drug development. Most identified genetic risk factors 
contribute moderately to disease and do not yet justify population 
screening. Because many risk-conferring genes may make only 
small contributions to risk, common variants (compared with rarer 
mutations) may not prove useful in risk prediction. 

In addition to bridging laboratory and human studies, the application 
of biomarkers could help to advance the treatment of people with, or at 
risk of, atherosclerosis by improving prognostication, by assessing the 
need for and intensity of therapy, by individualizing the use of specific 
therapies, and by helping to develop new therapeutics. For example, 
including CRP with conventional risk factors improves risk prediction 
for atherosclerotic events, both in people with and without established 
disease. Sensitive cardiac troponin measurements can detect levels 
of ischaemic damage far beneath the clinical threshold and convey 
incremental risk information. The serum protein cystatin-C and brain 
natriuretic peptides may also have clinical use in risk prediction. 

A randomized, multicentre trial that finished in 2008 illustrates how 
the inflammation biomarker CRP can identify individuals who are not 
eligible for therapy according to traditional approaches, but who could 
benefit from treatment with a statin that has potent LDL-lowering and 
anti-inflammatory effects. Yet this study could not, nor was it designed 
to, determine the mechanism of event reduction, reinforcing the need 
for future trials of anti-inflammatory agents that do not alter lipid levels. 

Beyond risk prediction and targeting of therapy, the application of 
biomarkers may help the field of cardiovascular therapeutics to confront 
the enormous challenge it faces — to discover and develop therapeutics 
for modulating atherosclerosis. Owing to the success of the current 
standard of care, clinical end-point trials now generally involve patient 
populations with lower event rates. Consequently, clinical trials that pit 
new strategies for reducing atherosclerotic events against the current 
standard of care will require greater numbers of participants and longer 
study durations than in previous eras, with attendant greater expense. 
Better validation methods are needed for the targets arising from the 
burgeoning basic science of atherosclerosis in humans. Genome-wide 
association studies are identifying numerous new targets for drug 
development. Application of this knowledge in humans will require 
methods to determine whether interventions will affect their intended 
targets, to optimize doses, and to obtain early signals compatible with 
clinical benefit in pilot studies of fewer subjects and shorter duration. 
Such methods could inform decisions about which agents should move 
forward into increasingly expensive and arduous large clinical trials. 
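
The link between falling event rates and growing trial size can be made concrete with a rough calculation. The sketch below uses a textbook normal-approximation formula for comparing two proportions; it is an illustration only, not the design method of any trial discussed here. The control event rates, the 20% relative risk reduction, the significance level and the power are all assumed values chosen for the example.

```python
from math import ceil
from statistics import NormalDist


def participants_per_arm(control_event_rate, relative_risk_reduction,
                         alpha=0.05, power=0.80):
    """Approximate participants needed per arm to compare two event rates.

    Uses the unpooled normal approximation for two proportions.
    All inputs are illustrative assumptions, not trial data.
    """
    p_control = control_event_rate
    p_treated = p_control * (1.0 - relative_risk_reduction)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1.0 - p_control) + p_treated * (1.0 - p_treated)
    effect = p_control - p_treated
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical scenario: the same 20% relative risk reduction, tested
# against progressively lower event rates under better standard care.
for rate in (0.10, 0.05, 0.025):
    n = participants_per_arm(rate, 0.20)
    print(f"control event rate {rate:.3f}: about {n} participants per arm")
```

Under these assumptions, halving the control event rate roughly doubles the number of participants required for the same relative effect, which is the arithmetic behind the larger and longer trials described above.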

There is no single optimum biomarker for reporting the possible 
clinical efficacy of new therapeutics. Biomarker selection for these 
purposes should reflect the mechanisms under scrutiny. Atheroma 
volume measured by intravascular ultrasound, for example, might be 
an appropriate biomarker for an intervention designed to unload lipid 
from plaques, such as an Apo-AI mimetic; the level of 
lipoprotein-associated phospholipase A2 in peripheral blood leukocytes would be an 
appropriate biomarker for determining the dose range of an inhibitor of 
that enzyme (in the absence of direct access to the relevant tissue — the 
human plaque itself); and CRP measurement could serve as a marker to 
assess an anti-inflammatory intervention that may not affect plaque size. 

Biomarkers include traditional analytes in body fluids and 
anthropometric measurements, and can include, by some definitions, 
structural variables measured by imaging. Biomarkers are unlikely 
to provide surrogate end points of efficacy that prove acceptable to 
regulatory agencies for the registration of new therapeutics in the 
foreseeable future, but they should assist in bridging the translational gap. 

In addition to anatomical imaging, harnessing biological processes 
to provide imaging targets (molecular imaging) may help to test 
mechanistic hypotheses in humans, and provide early signals about 
the efficacy of interventions in small and short pilot studies. With 
respect to atherosclerosis, molecular imaging using different platforms 
has proven promising in visualizing adhesion molecules, integrins, 
phagocytosis, proteases, reactive oxygen species and modified 
lipoproteins. The modalities that have shown potential in this regard 
include isotope-tagged ligands, paramagnetic agents visualized by 
magnetic resonance imaging, contrast-enhanced ultrasound and 
near-infrared fluorescent probes. Such methodologies might result 
in crucial information for phase II drug development, including 
ascertainment of in vivo targeting in humans (not just in the blood, 
but in the atheroma itself), and provide human data about doses for 
clinical end-point studies. 

Although molecular imaging of atherosclerosis shows promise in 
animals, it faces great hurdles to clinical translation. The production 
of molecular probes for human use requires good manufacturing 
processes, toxicology evaluation and, often, the extension of innovative 
imaging platforms beyond the pilot stage. Overcoming these barriers 
requires resources beyond the reach of most academic groups, 
necessitating governmental, industrial or philanthropic support. 


Clinical trials as a laboratory for discovery 

Clinical trials should be used more often as an early scientific probe, 
not just as a pathway to the commercialization of pharmaceuticals or 
for evaluating comparative efficacy of established agents. Although 
daunting to design, fund and conduct, clinical trials constitute the 
ultimate translational tool. The publications reporting many laboratory 
studies convey an optimistic speculation about clinical extrapolation. 
A deep and wide chasm separates the promises in these sentences 
and a randomized, prospective clinical trial that tests the conjecture. 
Prohibitive practical limitations impose themselves, and not many 
hypotheses arising from laboratory studies will undergo such rigorous 
clinical evaluation; hence, it is necessary to harness biomarkers more 
effectively to identify strategies that have the most promise for clinical 
translation. Clinical trialists should strive to archive biobanks and 
build biomarker sub-studies into clinical trials whenever possible, to 
allow post-hoc data mining, generate new hypotheses, and test those 
mechanistic hypotheses already specified. 

The increasing expense of clinical end-point trials, driven by the 
considerations explained above, constitutes a major limitation to the 
translation of biological advances to atherosclerosis treatment. The 
daunting costs of cardiovascular clinical trials have diverted investments 
of the pharmaceutical industry to other therapeutic areas, reducing 
the discovery effort and limiting the number of approaches that will 
undergo clinical evaluation. Models for public support of trials to test 
crucial hypotheses, including those that may have little commercial 
appeal, for funding of ancillary mechanistic studies or sub-studies, and 
for improvements in trial designs to render them less costly would help 
to surmount these barriers. 

The biological insights and experimental progress in understanding 
the mechanisms of atherosclerosis and its complications have advanced 
markedly. But full understanding of the applicability of laboratory 
findings to humans and the realization of therapeutic promise require 
another investigative dimension. We must reach beyond the tools 
available in the laboratory to probe pathophysiology, and more urgently 
strive to bridge the gap to human disease. 


Heart regeneration 


Michael A. Laflamme & Charles E. Murry 


Heart failure plagues industrialized nations, killing more people than any other disease. It usually results from a 
deficiency of specialized cardiac muscle cells known as cardiomyocytes, and a robust therapy to regenerate lost myocardium 
could help millions of patients every year. Heart regeneration is well documented in amphibia and fish and in developing 
mammals. After birth, however, human heart regeneration becomes limited to very slow cardiomyocyte replacement. Several 
experimental strategies to remuscularize the injured heart using adult stem cells and pluripotent stem cells, cellular 
reprogramming and tissue engineering are in progress. Although many challenges remain, these interventions may eventually lead 
to better approaches to treat or prevent heart failure. 


Heart regeneration has been intensely investigated, and 
extremely controversial, for more than 150 years. In pursuit 
of this subject, the heart has been stabbed, snipped, contused, 
cauterized, coagulated, frozen, injected with toxins, infected and 
infarcted, in species ranging from marine invertebrates to horses. 
Why has this proven to be such a difficult challenge? The heart is one 
of the least regenerative organs in the body, so if there is a regenerative 
response, it is small in comparison to that seen in many other tissues, 
such as liver, skeletal muscle, lung, gut, bladder, bone or skin. For most 
investigators, the question is about whether there is no regeneration, 
which is intrinsically difficult to prove, or whether it occurs but at very 
low rates, which is not easy to detect but possible using highly sensitive 
approaches. 

This is more than an academic argument. Heart failure is a burgeoning 
public health problem, and some predict that it will reach epidemic 
proportions as our population ages. Cardiomyocyte deficiency underlies 
most causes of heart failure. The human left ventricle has 2-4 billion 
cardiomyocytes, and a myocardial infarction can wipe out 25% of these 
in a few hours. Disorders of cardiac overload such as hypertension or 
valvular heart disease kill cardiomyocytes slowly over many years, and 
ageing is associated with the loss of ~1 g of myocardium (about 20 million 
cardiomyocytes) per year in the absence of specific heart disease. If the 
human heart has even a small innate regenerative response, it may be 
possible to exploit this therapeutically to enhance the heart's function. 
This fundamental motivation has kept investigators pursuing rare events 
for more than a century. 

Over the past 15 years, researchers have taken a more interventional 
approach to the injured heart, creating the field of cardiac repair. 
The ultimate goal of cardiac repair is to regenerate the myocardium 
after injury to prevent or treat heart failure. This interdisciplinary 
field draws from advances in areas such as stem cells, developmental 
biology and biomaterials in an attempt to create new myocardium 
that is electrically and mechanically integrated into the heart. Cardiac 
repair has moved rapidly from studies in experimental animals to 
clinical trials involving thousands of patients. In this Review, we 
summarize the evidence for heart regeneration in animal models and 
humans. We discuss the status of research using adult stem cells and 
pluripotent stem cells for cardiac repair in experimental animals, and 
explore the promises and problems of cellular reprogramming and 
tissue engineering. Clinical trials will be covered only briefly, owing 
to space limitations, so we refer interested readers to recent reviews 
on this topic. 


Heart regeneration in amphibia and fish 

Unlike humans, many amphibia and fish readily regenerate limbs, 
appendages and internal organs after injury. There is a long history of 
research on amphibian heart regeneration; more recently, the zebrafish 
has proven to be a particularly useful model, given its substantial 
regenerative capacity and amenability to genetic manipulation. The 
zebrafish heart fully regenerates after the surgical amputation of the 
cardiac apex, an injury that corresponds to a loss of approximately 
20% of the total ventricular mass. In the low-pressure zebrafish heart, 
this large wound is effectively sealed by an initial fibrin clot, which is 
gradually replaced by de novo regenerated heart tissue rather than by 
scar tissue. 

Not surprisingly, this regenerative response involves a substantial 
amount of cardiomyocyte proliferation. Even at baseline levels, 
zebrafish cardiomyocytes show a much higher degree of cell-cycle 
activity than equivalent cells from their mammalian counterparts. A 
recent study showed that approximately 3% of cardiomyocytes in the 
compact myocardium of uninjured adult zebrafish hearts incorporate 
the thymidine analogue bromodeoxyuridine (BrdU) during a seven-day 
pulse-labelling experiment. Two weeks after amputation of the cardiac 
apex, the fraction of BrdU-positive cardiomyocytes had increased 
by tenfold, and this parameter remained as high as 20% as late as one 
month after injury.

Initial experiments suggested that undifferentiated progenitor 
cells were the principal source of regenerating cardiomyocytes in 
zebrafish, but two recent genetic fate-mapping studies unambiguously
demonstrated that pre-existing committed cardiomyocytes are instead
the main source (refs 12, 13; Box 1). The two groups independently generated
transgenic zebrafish in which the cardiomyocyte-specific cmlc2 (also 
known as myl7) promoter drives the expression of tamoxifen-inducible 
Cre recombinase. These animals were crossed with a reporter line, 
in which Cre-mediated excision of a loxP-flanked stop sequence 
induces constitutive expression of green fluorescent protein (GFP). In 
the offspring of this cross, all pre-existing cardiomyocytes and their 
progeny can be induced to express GFP by tamoxifen treatment. If 
the regenerated myocardium were derived from undifferentiated 
progenitor cells, the new ventricular apex should be GFP−. Instead,
both groups found that the vast majority of the newly regenerated
cardiomyocytes were GFP+ (refs 12, 13). Thus, heart regeneration in
zebrafish is principally mediated by the proliferation of pre-existing 
cardiomyocytes, rather than the generation of new cardiomyocytes 
from stem cells. 
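
The logic of this lineage-tracing design can be made explicit with a toy model. The Python sketch below is illustrative only: the `Cell` class and the `tamoxifen_pulse`, `divide` and `differentiate` helpers are hypothetical names, and recombination is assumed to be 100% efficient, which real experiments do not achieve.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    is_cardiomyocyte: bool    # the cmlc2 promoter is active only in cardiomyocytes
    recombined: bool = False  # True once Cre has excised the loxP-flanked stop

    @property
    def reporter(self) -> str:
        return "GFP+" if self.recombined else "GFP-"


def tamoxifen_pulse(cell: Cell) -> Cell:
    # Cre is made only where the cardiomyocyte-specific promoter is active,
    # and it acts only while tamoxifen is present (idealized: 100% efficiency).
    if cell.is_cardiomyocyte:
        return Cell(is_cardiomyocyte=True, recombined=True)
    return cell


def divide(cell: Cell) -> Cell:
    # Recombination is a permanent DNA change, so a daughter cell inherits it.
    return Cell(cell.is_cardiomyocyte, cell.recombined)


def differentiate(progenitor: Cell) -> Cell:
    # A progenitor that becomes a cardiomyocyte after the pulse has ended
    # keeps whatever reporter state it had during the pulse.
    return Cell(is_cardiomyocyte=True, recombined=progenitor.recombined)


labelled_myocyte = tamoxifen_pulse(Cell(is_cardiomyocyte=True))
progenitor = tamoxifen_pulse(Cell(is_cardiomyocyte=False))

from_proliferation = divide(labelled_myocyte)  # hypothesis 1: existing myocytes divide
from_progenitor = differentiate(progenitor)    # hypothesis 2: progenitors differentiate

print(from_proliferation.reporter)
print(from_progenitor.reporter)
```

Under these assumptions the two hypotheses predict opposite reporter states in the regenerated apex, which is why the observed predominance of GFP+ cardiomyocytes points to proliferation of pre-existing cells.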
BOX 1

Genetic fate mapping in heart regeneration

Genetic fate mapping has proven to be an invaluable tool for dissecting
the mechanisms of endogenous cardiac repair in model organisms,
and several laboratories have used an elegant strategy based on the
conditional Cre–loxP system, which allows both temporal and
cell-type-specific control of reporter expression.

After amputation, the apex of the zebrafish heart can fully
regenerate. To determine the source of the newly proliferating
cardiomyocytes that underlies this regeneration, a zebrafish
strain carrying two transgenes was created (see Figure, a). In one
transgene, the cardiomyocyte-specific cmlc2 promoter drives
the expression of tamoxifen-inducible Cre recombinase. In the
second transgene, the constitutive β-actin promoter initially drives
expression of the red fluorescent DsRed protein. Cre recombinase
induces the excision of loxP-flanked stop sequences, causing a
permanent switch from constitutive DsRed to constitutive green
fluorescent protein (GFP) expression. Thus, when the transgenic
zebrafish was pulsed with tamoxifen, all of its cardiomyocytes and
their descendants expressed GFP. By contrast, cardiomyogenic
progenitor cells should remain DsRed+GFP−, because the
cardiomyocyte-specific cmlc2 promoter would not be active in
these undifferentiated cells. If progenitor cells later contributed
to cardiomyocyte renewal after injury, one would expect those
cardiomyocytes to also be DsRed+GFP−. Instead, after amputation
of the apex, the new apical myocardium was 100% GFP+, indicating
that heart regeneration in the zebrafish results from the expansion
of pre-existing cardiomyocytes, not from the recruitment of
cardiomyogenic precursors. Another independent group reached
the same conclusions using a similar experimental design.

An analogous genetic fate-mapping approach was used to
investigate the mechanisms of cardiac regeneration in mammalian
hearts. Here, a double-transgenic mouse was generated in which the
cardiomyocyte-specific Myh6 promoter drives tamoxifen-inducible
Cre recombinase, and Cre-mediated excision of loxP-flanked stop
sequences induces a switch from constitutive β-galactosidase
(β-gal) to constitutive GFP expression (see Figure, b). After tamoxifen
treatment, ~80% of the cardiomyocytes in the transgenic animal
became GFP+, and 20% remained β-gal+GFP−. As in the analogous
zebrafish experiment, any progenitor cells should remain GFP− after
the tamoxifen pulse. During normal ageing for up to 1 year after
tamoxifen treatment, the ratio of GFP+ to β-gal+ cardiomyocytes
remained fixed at 80:20, indicating no significant cardiomyocyte
renewal by unlabelled progenitor cells. However, after infarction,
the ratio of GFP+ to β-gal+ shifted to ~65:35 in the peri-infarct zone,
indicating that newly differentiated cardiomyocytes (β-gal+GFP−
because they had not undergone Cre-mediated recombination) had
been recruited from the progenitor pool. Hence, in mice, the small
amount of regeneration that occurs after injury involves the cardiac
induction of progenitor cells.
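
The shift in labelled fraction can be turned into a rough estimate of how many peri-infarct cardiomyocytes are newcomers. The Python sketch below is a back-of-envelope illustration, not the analysis used in the original study. The function name is hypothetical, and it assumes that labelled and unlabelled pre-existing cardiomyocytes are lost at equal rates and that labelled cells do not expand, so the labelled fraction falls only through dilution by unlabelled new cells.

```python
def fraction_from_unlabelled(labelled_before: float, labelled_after: float) -> float:
    """Fraction of cardiomyocytes in the sampled zone that are unlabelled newcomers.

    If a fraction f0 of cells is labelled before injury and f1 afterwards, and
    the only change is the addition of unlabelled cells, the newcomers make up
    1 - f1 / f0 of the final population.
    """
    if not 0 < labelled_after <= labelled_before <= 1:
        raise ValueError("expected 0 < labelled_after <= labelled_before <= 1")
    return 1 - labelled_after / labelled_before


# 80:20 before infarction, ~65:35 in the peri-infarct zone afterwards
new_fraction = fraction_from_unlabelled(0.80, 0.65)
print(f"{new_fraction:.3f}")
```

With the ratios quoted in the box, the estimate is just under one-fifth of peri-infarct cardiomyocytes, a figure that applies only to the sampled zone and only under the stated assumptions.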
Limited regeneration in rodent hearts 

Although they lack the remarkable regenerative capacity of the zebrafish 
heart, postnatal mammalian hearts also undergo some degree of 
cardiomyocyte renewal during normal ageing and disease. Despite all 
the recent attention by the field, this is not a new concept. Extremely 
low but detectable levels of cardiomyocyte cell-cycle activity have been 
reported in rodent studies dating back to the 1960s. Capturing the rare
dividing cardiomyocytes present in mammalian hearts is technically 
challenging, but recent work has taken advantage of the greater 
specificity and throughput afforded by transgenic mouse models. For 
example, transgenic mice were created in which the
cardiomyocyte-specific α-myosin heavy chain (Myh6, also known as α-MHC)
promoter drives nuclear-localized expression of β-galactosidase. This
convenient read-out allowed researchers to screen more than 10,000
cardiomyocyte nuclei in histological sections for the incorporation of
radiolabelled thymidine, and they found labelling indices of 0.0006%
for adult ventricular cardiomyocytes in intact hearts and 0.0083% for
cardiomyocytes in the border zone of injured hearts.

Although such proliferative indices are small, they raise the possibility
that such phenomena could be augmented therapeutically. Proof
of concept for this approach has come from transgenic mice with
cardiomyocyte-restricted overexpression of the cell-cycle activator
cyclin D2, because these animals show reduced scar tissue and improved
mechanical function after myocardial infarction. Other efforts to
enhance the proliferation of adult cardiomyocytes, by manipulating
oncogenes or cell-cycle regulators, have proven less consistent in
improving outcomes after infarction (see ref. 17 for a comprehensive
review). Pharmacological enhancement of cardiomyocyte cell-cycle
activity would be more practical clinically than gene therapy, and the
signalling molecules periostin, fibroblast growth factor-1 (ref. 19)
and neuregulin 1 (NRG1) have all been reported to act as mitogens
for adult ventricular cardiomyocytes and to exert beneficial effects on
cardiac structure and function after infarction. (It should be noted
that a more recent study has called into question the effects of
periostin on cell-cycle activity or cardiac repair.) A recent study of the
mitogenic effects of NRG1 showed that simple systemic injection of
this growth factor into adult mice enhanced infarct scar shrinkage
and improved mechanical function. The effects of NRG1 on the cell
cycle depended on cardiomyocyte expression of its tyrosine kinase
receptor, ERBB4, and NRG1 seems to stimulate mononucleated, but
not binucleated, cardiomyocytes to divide. Although this intriguing
result awaits independent confirmation, it suggests a straightforward 
approach to enhancing ventricular repair through the administration 
of recombinant growth factors. 

Most of the above studies focused on the proliferation of existing 
cardiomyocytes, and were not designed to detect cardiomyocytes 
formed from progenitor cells. To determine whether such progenitor 
cells contribute to cardiomyocyte renewal, researchers have performed 
an elegant genetic fate-mapping experiment in transgenic mice,
akin to those previously described in the zebrafish model, in which 
cardiomyocytes were indelibly labelled after a tamoxifen pulse 
(Box 1). This system allowed the authors to distinguish between 
cardiomyocyte renewal from pre-existing (and therefore fluorescently 
labelled) cardiomyocytes and cardiomyocyte renewal from unlabelled 
progenitor cells. Interestingly, they found no significant contribution 
by such progenitor cells during normal ageing, up to one year after 
tamoxifen treatment. However, they observed a reduction in the 
fraction of labelled cardiomyocytes after infarction, indicating 
dilution by unlabelled progenitor cells. When combined with the 
findings that the rate of cardiomyocyte proliferation is very low in 
both normal and injured rodent hearts, these data indicate that the 
limited endogenous reparative mechanisms in the adult mammalian 
heart operate differently from those in zebrafish, and depend more on 
replenishment by cardiomyogenic progenitor cells than on replacement 
by cardiomyocyte proliferation. 

A recent report suggests that these differences between mammalian 
and fish hearts do not necessarily apply earlier in development.
Borrowing approaches from the zebrafish model, the authors resected 
the left ventricular apex of one-day-old neonatal mice and observed a 
brisk regenerative response reminiscent of that in the adult zebrafish. 
By three weeks after injury, the defect had been replaced by normal 
myocardial tissue, which showed normal contractile function by eight 
weeks. Genetic fate-mapping studies indicated that this regeneration 
was mediated by the proliferation of pre-existing cardiomyocytes, 
again as in the zebrafish. Notably, this regenerative capacity was not 
observed in seven-day-old mice, suggesting that its loss may coincide 
with cardiomyocyte binucleation and reduced cell-cycle activity. 
Nonetheless, in addition to representing a surgical tour de force, this 
study indicates that zebrafish-like regenerative mechanisms are latent 
in mammalian hearts. It also provides a genetically tractable model for 
dissecting the blocks to these mechanisms in the mammalian adult. 


The evidence for human heart regeneration 
Before addressing whether new cardiomyocytes are generated in 
the human heart after injury, it is instructive to review a few points 
about normal cardiac growth and adaptation to workloads (Box 2). In 
brief, most human cardiomyocyte nuclei are polyploid by the onset of
puberty. In response to pathological workloads, such as hypertension,
valvular disease and post-infarction overload, human cardiomyocytes
commonly reinitiate DNA synthesis without nuclear division.
This increases cardiomyocyte nuclear ploidy further, reaching levels
as high as 64n (in which n represents the haploid set). Unlike rodent
cardiomyocytes, most human cardiomyocytes seem to remain
mononucleated throughout life. Thus, DNA synthesis is common in
the adult human heart, but this cannot be equated to cardiomyocyte 
proliferation without accounting for the process of polyploidization. 
Historically, regenerative responses have been detected by either
the macroscopic regrowth of the tissue or the microscopic presence
of mitosis. Macroscopic regeneration of the human heart clearly does
not occur. Mitosis occupies only ~2% of the cell cycle, making it hard
to quantify meaningfully. Experiments bear out this difficulty, with some
investigators reporting no mitosis after injury, and others reporting
rare (and potentially abnormal) mitotic figures around the injured
site. One factor contributing to these discrepancies has been the
inherent difficulty in recognizing cardiomyocyte nuclei in conventional
histological sections. Many of the published images are persuasive for
cardiomyocyte mitosis and provide important evidence that this can 
occur in humans. However, extrapolation to organ turnover rates from 
such low numbers is perilous. 

Several investigators have taken the approach of counting the number 
of cardiomyocytes in the human heart during normal and pathological 
growth, but this method is surprisingly difficult and requires
many assumptions. Using a combination of meticulous dissection, 
histopathology, biochemical measurements of tissue DNA content and 
fluorescent analysis of individual nuclear DNA content, researchers have 
shown that from myocardial weights of 50-350 g, the cardiomyocyte 
nuclear number is steady at ~2 billion. Beyond that, there is a linear 
increase in nuclear number with increasing heart weight, reaching 
4 billion cardiomyocyte nuclei in hypertrophied hearts weighing 
700-900 g. The number of non-cardiomyocytes such as fibroblasts and 
vascular cells increases linearly with heart weight throughout life. If 
correct, these data indicate that cardiomyocyte renewal occurs during 
pathological hypertrophy. An important caveat is that the assignment 
of cardiomyocyte versus non-cardiomyocyte nuclear identity was based 
on the size and morphology of isolated nuclei. Because we know that 
postnatal growth and pathological hypertrophy are accompanied by 
increases in nuclear ploidy (and hence in size), it is possible that diploid 
cardiomyocyte nuclei in smaller hearts were mistakenly classified 
as non-cardiomyocyte nuclei. Increases in cardiomyocyte nuclear 
number in pathological hypertrophy have also been reported, using
histological sections in which the cardiomyocyte nuclei can be more 
readily identified. 

Two studies have attempted to use more direct means to measure 
the rates of cardiomyocyte DNA synthesis in human hearts. The first 
approach, by Bergmann et al., was based on the worldwide pulse of
¹⁴C that occurred during the atmospheric testing of nuclear weapons in
the cold war. The atmospheric ¹⁴C became incorporated into plants and
entered the human food chain, labelling the DNA of dividing cells. After
the Limited Nuclear Test Ban Treaty of 1963, atmospheric ¹⁴C levels
dropped rapidly. This provided researchers with pulse-chase conditions
that can be used to date cells, simply by identifying when atmospheric
¹⁴C levels match those of the DNA. As expected, non-cardiomyocytes in
the normal human heart were found to be substantially younger than the 
patient, with ~18% turnover per year and a mean age of only four years. 
Notably, DNA from isolated cardiomyocyte nuclei (sorted by nuclear 
troponin staining) was also younger than the patient, although not 
nearly as young as that from non-cardiomyocytes. As indicated earlier, 
before one can infer cell division, it is essential to rule out a contribution 
from polyploidization. To do this, the authors sorted cardiomyocyte 
nuclei by DNA content, and analysed only the diploid DNA subset. 
The diploid cardiomyocyte nuclei were also younger than the patient, 
providing good evidence for cardiomyocyte division. Mathematical 
modelling suggested that cardiomyocyte renewal was age-dependent, 
with ~1% of cardiomyocytes being renewed per year at age 20, and 0.4% 
at age 75. On the basis of these kinetics, ~45% of cardiomyocytes would 
be predicted to be renewed over a normal human lifespan, whereas 55% 
would be cells persisting since birth. 
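
The link between the age-dependent annual rates and the lifetime figure can be checked with a simple calculation. The Python sketch below is not the model used by Bergmann et al. It assumes, purely for illustration, that the annual renewal rate falls linearly with age through the two quoted points (and along the same line before age 20), and that each renewal event replaces a randomly chosen cardiomyocyte, so that the fraction of cells persisting since birth decays as the exponential of the accumulated rate. The function names are hypothetical.

```python
import math


def renewal_rate(age: float) -> float:
    """Annual renewal rate at a given age (assumed linear in age).

    The line passes through the two values quoted in the text,
    ~1% per year at age 20 and ~0.4% per year at age 75.
    """
    slope = (0.004 - 0.010) / (75 - 20)
    return 0.010 + slope * (age - 20)


def fraction_renewed(lifespan: float, steps: int = 10_000) -> float:
    """Fraction of cardiomyocytes that no longer date from birth."""
    dt = lifespan / steps
    # Trapezoidal integration of the renewal rate from birth to `lifespan`.
    accumulated = sum(
        0.5 * (renewal_rate(i * dt) + renewal_rate((i + 1) * dt)) * dt
        for i in range(steps)
    )
    return 1 - math.exp(-accumulated)


print(f"renewed by age 75: {fraction_renewed(75):.0%}")
```

Under these assumptions the calculation returns a renewed fraction of roughly 45% at age 75, in line with the figure quoted above. The agreement should not be over-read, because the linear extrapolation to birth is an assumption of this sketch.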

BOX 2

Fetal hearts in humans and rodents grow through the proliferation
of mononucleated cardiomyocytes with diploid nuclei. In the first
few days after birth, rodent cardiomyocytes withdraw from the cell
cycle. By contrast, human cardiomyocytes seem to proliferate for the
first few months after birth, after which replication slows markedly.
The cells of both rodents and humans then undergo a period of
physiological growth, increasing in size by 30-40-fold. Most
cardiomyocytes cannot grow this much with a single diploid genome,
and different species have taken varying approaches to solve this problem.
Nearly all rodent cardiomyocytes undergo a final round of DNA
replication followed by nuclear division without cytokinesis, resulting
in a heart with more than 75% binucleated cells with a normal diploid
(2n) content of DNA in each nucleus (see Figure, a). In humans
and other primates, most cardiomyocytes undergo a final round of
DNA replication, without nuclear division or cytokinesis, resulting in
mononucleated cells with tetraploid (4n) or higher DNA content (a).
(Estimates of human cardiomyocyte binucleation rates range from
25% in enzymatically dispersed fresh tissue to more than 60% in
potassium-hydroxide-digested formaldehyde-fixed tissue. We favour
the 25% value, because the harsh potassium hydroxide digestion
may selectively eliminate smaller mononucleated cardiomyocytes.)
Notably, pacemaker cells of the sinoatrial and atrioventricular nodes
remain small and diploid throughout life.

As the myocardial mass of the human heart increases, the percentage
of diploid cardiomyocyte nuclei decreases steadily. This has been
demonstrated by cytofluorometric analysis of human cardiomyocyte
nuclear DNA content as a function of myocardial weight, after carefully
removing valves, vessels and fat (see Figure, b). (Hearts were grouped
by weight into four bins for clarity.) Studies of paediatric human hearts
indicate that polyploidization occurs in the pre-adolescent growth
phase, from 8 to 12 years of age. Tetraploid nuclei (4n) are most
common in the adult heart. During cardiac hypertrophy, octaploid nuclei
(8n) become most common, with a substantial number of 16n nuclei
or those with higher polyploidy. These data demonstrate that human
cardiomyocytes have a substantial capacity for DNA replication.
Morphometric analysis has shown that the normal human adult
number of 2 billion cardiomyocyte nuclei is reached by about
2 months of age. During physiological hypertrophy, the cardiomyocyte
nuclear number remains steady. However, when the heart weight
exceeds approximately 450 g (myocardial weight roughly 210 g),
there seems to be a linear increase in cardiomyocyte nuclear number
with increasing cardiac mass (see Figure, c). Because individual human
cardiomyocytes do not change their nuclear number with hypertrophy,
this is evidence for the generation of new cardiomyocytes, either from
pre-existing cardiomyocytes or from stem cells. Non-cardiomyocyte nuclei
increase linearly with increasing myocardial mass (see Figure, d),
indicating that proliferation of these cells accompanies all phases of
cardiac growth. The arrows in b–d denote the upper limits of normal
for human myocardial weight. The data in panels b–d are derived from
ref. 31; the trend line in c is hand drawn for illustration purposes only.

In the second approach, by Kajstura et al., the rates of cardiac
DNA synthesis were obtained by examining post-mortem hearts from
patients with cancer who had been treated with the thymidine analogue
iododeoxyuridine (IdU). This agent is incorporated into nascent DNA,
where it sensitizes cells to radiation therapy. IdU was given as bolus
injections or multiweek infusions, and the time between treatment and
death ranged from 7 days to 4.3 years. Using immunohistochemistry to
detect the IdU signal and identify cardiomyocytes, the researchers found
remarkably high rates of cardiomyocyte DNA labelling, ranging from
2.5% to 46%. No IdU staining was found in control hearts from patients
without cancer who had not been exposed to the radiosensitizer.
Mathematical modelling suggested that cardiomyocytes turn over at
a rate of 22% per year, compared with 20% for fibroblasts and 13% for
endothelial cells. Furthermore, 83% of the cardiomyocyte nuclei were
reported to be diploid, suggesting that this turnover reflects cell division,
not increased nuclear ploidy.

It is hard to reconcile these two studies, which differ by nearly 50-fold 
in their estimates of cardiomyocyte turnover. An important difference 
seems to be related to the higher rates of cardiomyocyte DNA synthesis
activity in IdU-treated patients with cancer. Kajstura et al. reported
threefold lower DNA synthesis rates (based on immunolabelling of 
the cell proliferation marker Ki-67) in control hearts from patients 
without cancer than in hearts from IdU-treated patients with cancer. 
Neither study adequately rules out a contribution from DNA repair, 
which can masquerade as DNA replication in these assays. This is of 
particular concern in patients with cancer receiving radiation treatment 
plus a radiosensitizer. Kajstura et al. suggested that only senescent 
cardiomyocyte nuclei contain troponin, which could bias the turnover 
studies of Bergmann and his colleagues towards low proliferation.
However, a follow-up paper by Bergmann et al. provided evidence
that nearly all cardiomyocyte nuclei were identified by troponin
staining. Furthermore, the findings of Kajstura et al. contradict two
well-accepted principles. First, the findings suggest that more than 80% 
of cardiomyocyte nuclei are diploid, in contrast to most other reports 
that suggest they are polyploid. If the nuclei were in fact polyploid, then 
polyploidization could underlie the authors’ high estimates of DNA 
synthesis. Second, the authors conclude that cardiomyocytes are as 
proliferative as non-cardiomyocytes, whereas most other investigators
find proliferation to be orders of magnitude greater in non-cardiomyocytes.
Indeed, the cardiomyocyte IdU-incorporation rate (2.5-46%) detected
by Kajstura et al. approaches the rate reported previously for sarcomas
targeted by IdU (50-70%). The heart does not proliferate like a sarcoma,
so these cardiac IdU-incorporation estimates must be too high. 
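
The size of the disagreement between the two studies depends on which age-specific rate from the ¹⁴C study is used for comparison. The short Python sketch below uses only the rates quoted in this section; the variable names are illustrative.

```python
# Annual cardiomyocyte renewal rates quoted for the 14C study
bergmann_rates = {"age 20": 0.010, "age 75": 0.004}
# Annual cardiomyocyte turnover quoted for the IdU study
kajstura_rate = 0.22

fold = {label: kajstura_rate / rate for label, rate in bergmann_rates.items()}

for label, value in fold.items():
    print(f"IdU estimate relative to 14C estimate at {label}: {value:.0f}-fold")
```

With the rounded values given here, the ratio runs from about 22-fold at age 20 to about 55-fold at age 75, so the 'nearly 50-fold' discrepancy corresponds to the comparison at older ages.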

Taken together, these human studies provide strong evidence for 
plasticity in the adult human heart. There is extensive morphometric 
evidence for DNA synthesis and an increase in cardiomyocyte number 
in diseased human hearts. Cardiomyocyte division or generation from 
progenitor cells probably occurs in the human heart, but it seems 
to be a very slow process. We need better tools to study this process 
quantitatively, and better ways to model it, if we hope to exploit it 
therapeutically. 


Stem cells and cell therapy 

Stem-cell biology is one of the fastest moving areas of biomedical 
research, and among all of the solid organs, the heart has one of the most 
active regeneration research programmes. The field can be conceptually 
organized into work involving endogenous and exogenous cells. The 
many exogenous cell types can be further divided into pluripotent cells 
(such as embryonic stem cells (ESCs) and induced pluripotent stem cells 
(iPSCs)) and adult cells of more limited potential (such as circulating 
progenitor cells, resident cardiac progenitor cells and cells native to 
other tissues). Here we focus on the cells closest to clinical trials and 
those for which there are the most reliable data. 


Cardiac progenitor cells 

Several investigators have reported resident populations of cardiac 
progenitor cells (CPCs) in postnatal hearts. These were identified using 
a variety of approaches, including studying the expression of surface 
markers such as c-KIT or SCA-1 (also known as LY6A; note that SCA-1 
has no apparent human orthologue) and physiological properties such 
as the ability to efflux fluorescent dye or form multicellular spheroids 
(reviewed in refs 38 and 39). Initially, it seemed that there was little 
overlap among CPCs identified by the different methods, and some 
scientists suggested that several populations of CPC exist. More recent 
studies indicate shared markers among once-distinct populations or 
different stages of maturation in the same line of cells, so the field
may be converging. 

CPCs expressing the tyrosine kinase receptor c-KIT are the most 
extensively studied. In the human adult, c-KIT is expressed by telocytes 
(formerly known as the interstitial cells of Cajal), the thymic epithelium 
and mature circulating cells such as haematopoietic stem cells and mast 
cells. Immature endothelial cells and cardiomyocytes also express c-KIT
during development. Small round cells expressing c-KIT have been
identified in the perivascular compartment of the adult heart, and
their abundance increases in human heart failure. After isolation
from rat and human hearts, c-KIT+ cells have been reported to give rise
to cardiomyocytes, smooth muscle cells and endothelial cells. Some
studies indicate that, when transplanted, c-KIT+ cells induce large-scale
regeneration of myocardial infarcts and contribute to the formation
of new myocardium and vessels, whereas others suggest smaller-scale
regeneration. On the basis of these data, a clinical trial is under
way, testing the safety and feasibility of autologous c-KIT+ cells as an
adjunctive treatment for patients undergoing coronary bypass surgery
(ClinicalTrials.gov identifier NCT00474461).

Not all studies with c-KIT+ CPCs gave robustly positive results. In
studies with genetic read-outs for lineage tracing and differentiation
state, c-KIT+ cells from the adult mouse heart have not been shown
to differentiate into cardiomyocytes in vitro or after transplantation
into infarcted hearts. Another study using transgenic reporter mice
found no evidence to suggest that endogenous c-KIT+ cells differentiate
into cardiomyocytes, although re-expression of c-KIT in pre-existing
cardiomyocytes was identified after injury. Others point out that
myocardium, like all solid tissues, contains mast cells. Mast cells are
small round cells that reside in clusters in the perivascular space,
strongly express c-KIT and increase in number in failing hearts.
Studies in humans suggest that 90-100% of all of the cardiac c-KIT+
cells are actually mast cells. However, expansion in culture seems to
select for c-KIT+ cells that lack mast-cell markers, indicating that freshly
isolated cells and cultured cells are different populations.

Another CPC population in clinical trials is cardiosphere-forming 
cells. These cells are isolated on the basis of their ability to migrate 
out of cultured cardiac tissue fragments and form spheroids in 
suspension cultures. As one might predict, this yields a mixture
of cells, some of which express stem-cell markers such as c-KIT, and
others that seem to come from the stromal-vascular compartment.
CPCs have been reported to give rise to cardiomyocytes in vitro and
in vivo after transplantation, and to enhance cardiac function after
infarction. On the basis of these data, a clinical trial of autologous
CPCs has been initiated for patients with recent myocardial infarctions
(NCT00893360). The ‘stemness’ of CPCs has recently been questioned,
and it has been suggested that these cells are principally cardiac
fibroblasts and that CPC-derived cardiomyocytes are contaminants
derived from the original tissue.

Thus, although the study of CPCs is an exciting, new area of cardiac 
research, it is also one of the most controversial. Most of the work has 
focused on cell culture and transplantation, driven by the clinical need 
for cardiac repair. We know almost nothing about the endogenous 
behaviour of CPCs, however. An important question remains about 
the role of these cells in development, homeostasis, ageing and reaction 
to injury. The field needs models that permit unambiguous tracing of 
CPC lineage and phenotype without resorting to transplantation or 
cell culture (Box 1). 


Bone marrow cells 

Considerable interest in bone-marrow-derived cells for cardiac 
repair was prompted by reports of haematopoietic stem cells 
transdifferentiating into cardiomyocytes. Subsequent studies have
shown that haematopoietic stem cells do not form cardiomyocytes
but instead become mature blood cells after transplantation.
Nevertheless, animal studies show improvements in ventricular 
function when haematopoietic cells are administered after infarction, 
implicating paracrine signalling as the major mechanism of action. 

Work with marrow-derived stromal cells (MSCs) has followed a 
similar trajectory. MSCs were originally reported to transdifferentiate 
into cardiomyocytes but are now thought to exert their main actions
in a paracrine manner through the release of cytokines. Interestingly,
most MSCs die within days or weeks of transplantation into infarcts, 
yet their beneficial effects can be seen long term, suggesting a critical 
window of time for the action of MSCs after infarction. MSCs probably 
operate by many mechanisms, but considerable evidence points towards 
regulation of the WNT pathway. MSCs secrete antagonists of canonical 
WNT ligands, such as secreted frizzled related protein 2 (ref. 56). 
Blocking the production of WNT antagonists limits the beneficial effects 
of mouse MSCs. A recent report has shown that the administration of 
MSCs to pig infarcts stimulated endogenous CPCs to contribute to the 
repair of the infarcts. Further identification of paracrine mediators
may allow the development of simpler, cell-free treatments based on 
proteins or small molecules. 

Clinical trials have mostly focused on the delivery of bone marrow 
mononuclear cells by the coronary circulation. It should be emphasized 
that >99.9% of bone marrow mononuclear cells are not stem cells, but
are committed, although immature, granulocytes or other haematopoietic
lineages. These trials indicate that the delivery of bone marrow
derivatives through the coronaries is feasible and safe, but the benefits
are modest. MSCs are also in clinical trials (NCT00587990). There are
few published results with these cells, but one of the strongest
cardiac-repair treatment effects seen so far (a 14% improvement in
ejection fraction, the fraction of blood ejected from the left ventricle
during one contraction) was reported after the intracoronary
administration of large numbers of autologous MSCs. Allogeneic MSCs
administered to patients intravenously within ten days of infarction
were well tolerated and were associated with decreased arrhythmias and
an improvement in some indices of contractile function.
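
For readers less familiar with the measure, ejection fraction is conventionally calculated from the end-diastolic and end-systolic volumes of the left ventricle. The Python sketch below shows the arithmetic. The volumes are hypothetical values chosen for illustration, not data from any trial, and the example treats the 14% improvement as an absolute change of 14 percentage points, which is an assumption because the text does not specify whether the change is absolute or relative.

```python
def ejection_fraction(end_diastolic_ml: float, end_systolic_ml: float) -> float:
    """Fraction of the end-diastolic volume ejected in one contraction."""
    if end_diastolic_ml <= 0 or not 0 <= end_systolic_ml <= end_diastolic_ml:
        raise ValueError("volumes must satisfy 0 <= end-systolic <= end-diastolic, with end-diastolic > 0")
    return (end_diastolic_ml - end_systolic_ml) / end_diastolic_ml


# Hypothetical left ventricular volumes (ml) before and after treatment
baseline = ejection_fraction(end_diastolic_ml=150, end_systolic_ml=100)
followup = ejection_fraction(end_diastolic_ml=150, end_systolic_ml=79)

print(f"baseline {baseline:.1%}, follow-up {followup:.1%}, "
      f"absolute change {followup - baseline:.1%}")
```

In this illustration the ventricle ejects a further 21 ml per beat from the same filling volume, which moves the ejection fraction from one-third to just under one-half.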

Taken together, the best current evidence indicates that bone marrow 
cells do not work by directly differentiating into new cardiomyocytes. 
Instead, the cells have been shown to elaborate signals that control 
the response of cells native to the myocardium, and thereby regulate 
healing. Although many view this as a novel aspect of stem-cell biology, 
students of pathology will recognize that this phenomenon fits under 
a more familiar heading: inflammation. We find it useful to consider
the participation of marrow derivatives in cardiac repair as part of 
the inflammatory response, which is known to regulate angiogenesis, 
cardiomyocyte survival and left ventricular remodelling after infarction. 


Pluripotent stem cells 

Many types of adult stem cell are unable to generate large numbers of 
unambiguous cardiomyocytes. This limitation does not apply to ESCs 
or their more recently developed ‘man-made’ counterpart, iPSCs. 
Because both ESCs and iPSCs can be propagated indefinitely, while 
still retaining the capacity to differentiate into almost all cell types, they 
are a potentially inexhaustible supply of human cardiomyocytes. Our 
current thinking about how cardiomyocytes arise from ESCs is shown 
in Fig. 1. Human ESC-derived cardiomyocytes express early cardiac 
transcription factors such as NKX2.5, as well as the expected sarcomeric 
proteins, ion channels, connexins and calcium-handling proteins 
(Fig. 2). They show similar functional properties to those reported 
for cardiomyocytes in the developing heart, and undergo comparable 
mechanisms of excitation–contraction coupling and neurohormonal
signalling. Although human ESC-derived cardiomyocytes have
been more intensively studied, data indicate that human iPSC-derived
cardiomyocytes have a very similar phenotype. Importantly,
cardiomyocytes from either pluripotent stem-cell type are immature 
and so lack the expression profile, morphology and function of adult 
ventricular cardiomyocytes. 

The cardiac potential of ESCs and iPSCs is indisputable, but their 
unique origin and pluripotency present a new set of challenges.
ESCs are derived from the inner cell mass of preimplantation-stage
blastocysts, and this contributes to the ethical controversy surrounding
their use. Moreover, ESC-based therapies will be allogeneic and 
require immunosuppression. iPSCs were originally generated by the 
reprogramming of adult somatic cells such as dermal fibroblasts by 
the forced expression of up to four stem-cell-related transcription 
factors. As such, their derivation does not involve the destruction
of embryos, and they could be used in autologous cell therapies. 
Nonetheless, first-generation iPSCs were problematic because the 
reprogramming factors were introduced using integrating viruses, 
raising concerns about neoplastic transformation. More recently, there 
have been a variety of refinements to iPSC generation that should 
reduce or eliminate this risk, including the use of episomal gene 
delivery, excisable transgenes, cell-permeable recombinant proteins and 
synthetic messenger RNA (see ref. 70 and references therein). Perhaps 
most notably, several small molecules have been shown to greatly 
enhance the efficiency of reprogramming, inviting speculation that
iPSCs may be generated using such factors alone in the near future. 
Further work will be required to more precisely define the phenotype 
and maturation potential of cardiomyocytes derived from iPSCs 
generated by these methods. 

Another concern relating to the clinical application of pluripotent 
stem cells is their capacity to form teratomas after transplantation.
To overcome this, the field needs to develop methods to enrich ESC
and iPSC derivatives for cardiomyocytes or other useful cell types
(such as endothelial, smooth muscle and stromal cells).

Figure 1 | Cardiovascular lineages during embryonic development and
ESC differentiation. Cardiac differentiation from ESCs closely mimics
cardiac development in the embryo. In either case, the specification of the
cardiovascular lineages involves a transition through a sequence of increasingly
restricted progenitor cells, proceeding from a pluripotent state to mesoderm
and then to cells committed to cardiovascular fates. Growth factors that regulate
fate choices are listed at branch points (blue), and key transcription factors and
surface markers for each cell state are listed under the cell types (green). The
growth factors are useful for directing the differentiation of ESCs, whereas the
markers are useful for purifying cells at defined developmental states. Primitive
cardiomyocytes in the embryonic heart tube and nodal or pacemaker cells
show slow electrical propagation and a small cell size. By contrast, the eventual
specification of working atrial and ventricular cardiomyocytes is accompanied
by more rapid conduction, ion-channel remodelling and increased cell size.
Although the field has made considerable progress towards determining
the early events of cardiogenesis, a better understanding of how pacemaker
and chamber-specific cardiac subtypes are formed is required for clinical
applications. BMPs, bone morphogenetic proteins; CNTN2, contactin-2;
CX, connexin; FOXA2, forkhead box protein A2; HCN4, potassium/sodium
hyperpolarization-activated cyclic nucleotide-gated channel 4;
MESP, mesoderm posterior protein; MLC2a/v, myosin light chain 2a and/or
2v; MYH, myosin heavy chain; NPPA, natriuretic peptide precursor A;
NRG1, neuregulin 1; PDGF, platelet-derived growth factor; PDGFR, PDGF
receptor; SCN5A, sodium channel protein type 5 subunit α; SOX, SRY-related
high-mobility-group box; TBX, T-box transcription factor; VEGF, vascular
endothelial growth factor; VEGFR-2, VEGF receptor-2.

Historically,
ESCs have been differentiated by culture in three-dimensional 
aggregates known as embryoid bodies, in medium containing a high 
percentage of fetal calf serum. This method is poorly cardiogenic, 
and differentiated human embryoid bodies are typically composed of 
less than 1% cardiomyocytes. More recently, our group and others
have used insights from developmental biology to devise better 
controlled approaches in which human ESCs and iPSCs are treated 
with defined factors, resulting in highly enriched populations of 
cardiomyocytes. A common theme with such methods has been
the manipulation of cardioinductive molecules belonging to the
transforming growth factor-β superfamily, specifically activin and
the bone morphogenetic proteins (BMPs). Our group has reported 
a protocol involving the serial application of activin A and BMP4, 
for example, which reliably yields ~60-80% human ESC-derived 
cardiomyocytes in large-scale preparations (~10°-10” total cells)’”*””. 
Further refinements are possible by manipulating the WNT–β-catenin
signalling pathway, which mediates biphasic effects on ESC
cardiogenesis, promoting mesodermal induction early but inhibiting
cardiogenesis late.

A complementary approach involves the isolation of mesodermal 
progenitor cells with a more restricted potential, such as cardiovascular 
progenitor cells that can differentiate into cardiomyocytes, smooth 
muscle cells and endothelial cells. Such multipotent progenitor cells 
have been identified in differentiating ESC cultures on the basis of 
their expression of transcription factors such as mesoderm posterior 
protein 1 (MESP1), NKX2.5 (ref. 81) and ISL1 (refs 82 and 83).
Arguably more useful for eventual clinical application are progenitor
populations that can be sorted on the basis of their expression of a
cell-surface marker, such as the cardiovascular progenitor cells marked by
expression of vascular endothelial growth factor receptor-2 (VEGFR-2,
also known as FLK1 and KDR). If such cells could be induced to
self-renew, they would potentially be very useful for cardiac repair.

Human ESC-derived cardiomyocytes have been shown to engraft in 
infarcted mouse, rat, guinea pig and pig hearts (Fig. 3), forming islands 
of nascent, proliferating human myocardium within the scar zone.
This partial remuscularization was accompanied by beneficial effects
on regional and global cardiac function, although some investigators
have questioned whether these effects are sustained at later time
points. Notably, the mechanism (or mechanisms) underlying the
observed improvements in contractile function remains unresolved. In
the aforementioned rodent studies, most of the graft tissue was isolated
from the host myocardium by means of scar tissue, which may prevent
synchronous beating. Furthermore, these human cells, which fire in
vitro at ~50–150 beats per minute (b.p.m.), may not keep pace with the
rapid rate of rats (~400 b.p.m.) and mice (~600 b.p.m.). If they cannot, 
then the observed salutary effects probably resulted from an indirect, 
paracrine mechanism, like those described above for adult cells. This 
also indicates that further beneficial effects on cardiac function may 
be possible after transplantation to a slower-rated recipient, such as a 
canine or porcine infarct model. 


Reprogramming fibroblasts to cardiomyocytes 

Fifteen years ago, researchers showed that fibroblasts could be 
transdifferentiated into skeletal muscle in vitro or in the injured heart 
by overexpressing the gene encoding the myogenic transcription 
factor, MyoD. Despite an intensive search by several groups, no
comparable master gene for cardiac muscle was found, and interest in 
reprogramming waned. Spurred by the discovery of iPSCs, scientists 
have returned to this field, using combinations of transcription factors 
to reactivate core transcriptional networks of desired cell types. In an 
attempt to induce cardiac differentiation, researchers performed a 
systematic screen of 14 cardiac transcription factors for their ability 
to activate a cardiac-specific transgene, the Myh6 promoter driving
yellow fluorescent protein (YFP) expression, in cardiac fibroblasts.
The full cocktail activated fluorescence in ~1% of cells.

Figure 2 | Guided differentiation and phenotype of cardiomyocytes from
pluripotent stem cells. a, Selected protocols for the guided differentiation
of human ESCs and iPSCs into cardiomyocytes using chemically defined
factors. The top timeline shows a protocol from our group in which
differentiating cells are serially pulsed with activin A (AA) and BMP4 under
monolayer culture conditions. The middle timeline shows a protocol
from ref. 75 that involves embryoid body (EB) formation in suspension
cultures, and the application of several signalling molecules such as activin
A, BMP4, basic fibroblast growth factor (bFGF), dickkopf-related protein
1 (DKK1) and VEGF. The bottom timeline shows a protocol from ref. 76,
in which embryoid bodies in suspension are continuously cultured in
insulin-free medium (IFM) supplemented with prostaglandin I₂ (PGI₂) and
an inhibitor of p38 MAP kinase (MAPK). b, Representative human ESC-derived
cardiomyocytes, differentiated using the monolayer protocol (top
timeline in a), immunostained for α-actinin (red) and CX43 (green). Nuclei
are shown in blue. c, Representative human iPSC-derived cardiomyocytes,
differentiated using the monolayer protocol (top timeline in a),
immunostained for α-actinin (green) and the transcription factor NKX2.5
(red). d, Intracellular [Ca²⁺] ([Ca²⁺]ᵢ) transients in a human ESC-derived
cardiomyocyte before (black) or after (red) the application of diltiazem, an
L-type Ca²⁺-channel blocker. The absence of [Ca²⁺]ᵢ transients after diltiazem
treatment indicates that extracellular Ca²⁺ is required to initiate intracellular
Ca²⁺ release, just as in adult cardiomyocytes. F/F₀ denotes the change in
fluorescence intensity. e, Human ESC-derived cardiomyocytes show the
characteristic action-potential properties of either working chamber (top)
or nodal (bottom) cardiomyocytes, indicating early subtype specification.

A systematic winnowing yielded three transcription factors (MEF2C, GATA4 and
TBX5) that activated the transgene in 20% of fibroblasts. About 4% 
of the cells expressed endogenous sarcomeric proteins such as cardiac 
troponin T, and only ~1% showed functional properties such as 
spontaneous beating. Thus, most of the YFP⁺ cells were only partially
reprogrammed, although their global gene expression patterns had 
shifted markedly from fibroblast to cardiomyocyte. 

© 2011 Macmillan Publishers Limited. All rights reserved

Figure 3 | Grafts of human ESC-derived cardiomyocytes in the
cryoinjured guinea-pig heart. Representative photomicrographs
demonstrating substantial implants of human myocardium within the scar
tissue. a, Using picrosirius red stain, the scar appears red, and viable tissue
is green. b, The human origin of the graft myocardium was confirmed
in an adjacent section by combined in situ hybridization, with a
human-specific pan-centromeric (HuCent; brown) probe, and β-myosin heavy
chain (MYH7; red) immunohistochemistry. c, Inset from b at higher
magnification. The nuclear localization of the HuCent signal confirms the
human origin of these cells. d, Immunostaining for α-actinin (red) highlights
the sarcomeric organization of the graft cardiomyocytes. Nuclei have been
counterstained with Hoechst 33342 (blue).

While this manuscript was under review, a different method of
reprogramming mouse embryonic fibroblasts to cardiomyocytes
was reported. This group used the ‘Yamanaka factors’, namely OCT4
(also known as POU5F1), SOX2, KLF4 and c-MYC, to initiate
reprogramming, but they blocked signalling through the JAK-STAT
pathway, which is required for pluripotency in the mouse, and added 
the cardiogenic factor BMP4. These modifications yielded minimal 
generation of iPSCs, but instead activated the cardiac progenitor 
program and, within 2 weeks, generated substantial numbers of beating 
colonies. By 18 days after induction, approximately 40% of the cells 
expressed cardiac troponin T. The authors attributed the increased 
efficiency to the generation of highly proliferative progenitor cells, 
as opposed to the formation of cardiomyocytes with low proliferative 
potential. It should also be noted that this study used mouse embryonic 
fibroblasts, whereas the systematic screen of 14 transcription factors was 
principally in postnatal mouse cardiac fibroblasts. 

Reprogramming the scar-forming fibroblast to a cardiomyocyte is 
intuitively appealing, particularly if it can be done directly in the infarct. 
To succeed clinically, we need to know how normal these reprogrammed 
cardiomyocytes are, and the process will have to be much more efficient 
and transgene-free. Despite some challenges, this is an exciting avenue 
of research and could be a game changer. 


Tissue engineering 

Tissue engineering refers to the growth of three-dimensional tissues 
in vitro, with the aim of building more biologically relevant models 
for in vitro study or tissues for in vivo regenerative therapy. Most 
commonly, this involves the use of porous, biodegradable scaffolds 
onto which cells are seeded, but other approaches include casting cells 
into hydrogels or creating scaffold-free tissues composed only of cells 
and the matrix they secrete. Synthetic materials have big advantages in 
manufacturability, typically being easy, cheap and reproducible to make. 
However, synthetics generally have worse biocompatibility, because they 
cause foreign-body inflammatory reactions and, sometimes, release 
locally toxic degradation products. Bioreactors are often used in tissue 
engineering to provide electrical and mechanical conditioning or to 
deliver nutrients to the tissue by perfusion systems. 

A major aim of bioengineering is to improve the host response to 
biomaterials, in essence, to make materials that can heal. Surprisingly,
the chemical composition of a material does not have a major influence
on how the body responds to it. Whether materials are organic or metallic,
hydrophobic or hydrophilic, or positively or negatively charged, they all
cause similar foreign-body reactions. Instead, what the body seems to
sense is the surface topography of a material. When surfaces are smooth,
there is intense inflammation and scarring, creating a fibrotic capsule
around the implant. If a surface is given a more complex topography, for
example, by creating pores or grooves, there is less inflammation, scarring
diminishes and blood vessels grow into the implant. Systematic variation
in the topology can ‘tune’ this host response. For example, our
tissue-engineering group has developed scaffolds with two compartments:
cylindrical channels to generate cables of cardiomyocytes, surrounded
by a network of smaller interconnected pores for stromal and vascular
ingrowth. The pores are optimally sized to maximize vascularization
within the implant and minimize fibrosis around it. 

The cardiomyocytes used in tissue engineering have been immature 
cells derived from young animals or stem cells. To take on an adult 
workload, these cells will need to organize into the cable-like 
structure of myocardium and increase their size by more than 20-fold 
compared with the neonatal stage. There is a continuing debate in 
tissue engineering about whether this maturation should take place 
before or after transplantation. On the one hand, electrical and
mechanical stimulation in vitro enhance hypertrophy, alignment and
electromechanical function of rat cardiomyocyte constructs. On the
other hand, greater cell differentiation is associated with worse survival
after transplantation, so there is probably a point of diminishing
returns. This needs to be explored further experimentally. 

One of the big lessons from tissue engineering has come from studies 
comparing cardiomyocyte-only with mixed-cell constructs. When 
cardiomyocyte-only constructs are transplanted, the tissue survives 
poorly. When vascular endothelial cells together with a stromal cell 
population are included, the endothelial cells form networks resembling 
a primitive vascular plexus, and the stromal cells form a provisional 
matrix that enhances mechanical integrity. After transplantation,
the endothelial network organizes into a definitive vascular network 
that connects to the host circulation, bringing blood flow into the tissue 
several days sooner than would otherwise be seen. Indeed, our group 
and others have demonstrated improved survival of prevascularized 
human myocardial constructs incorporating vascular and stromal 
elements compared with constructs containing cardiomyocytes 
alone. This indicates that there is considerable synergy in including
vessels and connective tissue elements when engineering tissue. 

Tissue engineering has not been as extensively studied as cell 
transplantation in preclinical disease models, but initial studies are 
promising. A recent study prepared constructs of engineered rat
heart tissue from neonatal cardiomyocytes and conditioned them 
for several days using a cyclic stretch system. The constructs were 
sutured to the surface of rat hearts that had been infarcted two weeks 
previously, and were studied one month after implantation. Compared 
with infarcted hearts receiving non-contractile constructs, hearts 
receiving the engineered heart tissue had better contractile function, 
and interestingly, conduction velocities across the infarct were 
improved, probably because the grafts had electrically connected to 
the surrounding viable myocardium. Another group reported that 
patches generated from cardiosphere-derived CPCs can enhance heart 
function after infarction, and there are hints that tissue engineering
also provides a larger graft size compared with cell transplantation. 


Perspective 

After more than a decade of furious activity, the science of stem cells 
seems to be catching up with its promise. Clinical-scale preparations of 
the main cardiac cell types can now be generated, and we are learning the 
rules for building myocardium and keeping it alive after transplantation. 
Clinical trials have established techniques for cell delivery, and protocols 
for establishing feasibility, safety and early-stage efficacy in humans are 
in place. The first patient trials have demonstrated safety with hints of 
efficacy. So far, so good. 

That said, many short- and long-term challenges remain. In the near 
term, it will be important to derive the right subtype of cardiomyocyte, 
for example, ventricular cardiomyocytes that are free of pacemaker 
cells for repair of an infarct. The major challenge facing the field of 
adult CPCs is to develop protocols with higher yields of definitive
cardiomyocytes. Researchers studying pluripotent stem cells need to 
identify the optimal stage of differentiation and demonstrate that these 
cells can be used without tumorigenesis. The question of allogeneic 
versus autologous cells remains open. Although desirable, autologous
cells will be more expensive and more variable, and the time needed to
expand them precludes their use in any acute setting. Allogeneic cells
will provide the only off-the-shelf product, but we need to learn how 
best to manage the immune response to prevent their rejection. All 
of these efforts will be advanced by improvements in integration of 
the graft, including control of vascularization (growth of both arterial 
conduits and microvasculature), inflammation and scarring. 

Further ahead, in situ manipulation of cells in the heart may allow us 
to control their fates, thereby obviating transplantation. For example, 
it may be possible to control CPCs using small molecules or growth 
factors to enhance their regenerative abilities. Fibroblasts in infarcts 
could potentially be reprogrammed directly to cardiomyocytes. Given 
our increasing ability to control the fates of cells and tissues, the debate 
over whether the heart is intrinsically terminally differentiated seems 
anachronistic, for the heart does not exist apart from the person who 
knows how to manipulate it. It is more useful to ask what we can do to
promote cardiac regeneration best, and then do it.

Gene expression is a multistep process that involves the transcription, translation and turnover of messenger RNAs and
proteins. Although it is one of the most fundamental processes of life, the entire cascade has never been quantified on a 
genome-wide scale. Here we simultaneously measured absolute mRNA and protein abundance and turnover by parallel 
metabolic pulse labelling for more than 5,000 genes in mammalian cells. Whereas mRNA and protein levels correlated 
better than previously thought, corresponding half-lives showed no correlation. Using a quantitative model we have 
obtained the first genome-scale prediction of synthesis rates of mRNAs and proteins. We find that the cellular abundance 
of proteins is predominantly controlled at the level of translation. Genes with similar combinations of mRNA and protein 
stability shared functional properties, indicating that half-lives evolved under energetic and dynamic constraints. 
Quantitative information about all stages of gene expression provides a rich resource and helps to provide a greater
understanding of the underlying design principles.


The four fundamental cellular processes involved in gene expression
are transcription, mRNA degradation, translation and protein degradation.
It is now clear that each step of this cascade is controlled by
gene-regulatory events. Although each individual process has been
intensively studied, little is known about how the combined effect of all 
regulatory events shapes gene expression. The fundamental question of 
how genomic information is processed at different levels to obtain a 
specific cellular proteome has therefore remained unanswered. 

With regard to a quantitative description of gene expression, 
numerous previous studies comparing mRNA and protein levels concluded
that the correlation is poor. However, the available data
suffer from several limitations. Most studies are limited to a few 
hundred genes, mainly due to the technical challenges involved in 
large-scale protein identification and quantification. Also, protein 
levels measured in one experiment are typically compared to 
mRNA levels determined in a different experiment performed at a 
different time in a different laboratory, making it difficult to interpret 
why the correlation is low. Finally, mRNA and protein levels result 
from coupled processes of synthesis and degradation. Therefore, analysis
of mRNA and protein levels alone cannot provide sufficient
information to understand gene expression comprehensively.
mRNA and protein turnover can be measured using drugs that inhibit
transcription or translation, but this has severe side effects. Studies
based on artificial fusion proteins are problematic because tagging can
affect protein stability.

To overcome these limitations we sought to quantify cellular mRNA 
and protein expression levels and turnover in parallel in a population 
of unperturbed mammalian cells. Pulse labelling with radioactive 
nucleosides or amino acids is regarded as the gold standard method 
to determine mRNA and protein half-lives. Recently, variants of this 
approach based on non-radioactive tracers have been established.
In stable isotope labelling by amino acids in cell culture (SILAC), cells
are cultivated in a medium containing heavy stable-isotope versions of
essential amino acids. When non-labelled (that is, light) cells are
transferred to heavy SILAC growth medium, newly synthesized proteins
incorporate the heavy label while pre-existing proteins remain in the
light form. This strategy can be used to measure protein turnover or
relative changes in protein translation. Similarly, newly synthesized
RNA can be labelled with the nucleoside analogue 4-thiouridine (4sU).
4sU-containing mRNA can be purified and compared with the pre-existing
fraction to compute mRNA half-lives.
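
As an illustration of how a half-life can follow from such a comparison, the minimal sketch below assumes first-order decay, steady-state mRNA levels and complete labelling, and it ignores any correction for cell growth. Under those assumptions the pre-existing fraction of total RNA after a pulse of length t equals exp(-kt). The function names and the labelling-time parameter are hypothetical and are not taken from the study's methods.

```python
import math


def half_life_from_preexisting(pre_over_total: float, label_hours: float) -> float:
    """Half-life (h) from the pre-existing/total RNA ratio after a 4sU pulse.

    Assumes first-order decay at steady state: pre/total = exp(-k * t).
    """
    if not 0.0 < pre_over_total < 1.0:
        raise ValueError("ratio must lie strictly between 0 and 1")
    decay_constant = -math.log(pre_over_total) / label_hours
    return math.log(2) / decay_constant


def half_life_from_new(new_over_total: float, label_hours: float) -> float:
    """Half-life (h) from the newly synthesized/total RNA ratio.

    Uses the same model, with new/total = 1 - pre/total.
    """
    return half_life_from_preexisting(1.0 - new_over_total, label_hours)
```

In this simplified model the two ratios are redundant; measuring both gives a consistency check on the fractionation.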


Pulse labelling of proteins and mRNAs 


We used parallel metabolic pulse labelling with amino acids and 4sU
to measure simultaneously protein and mRNA turnover in a population
of exponentially growing non-synchronized NIH3T3 mouse
fibroblasts (Fig. 1a). Protein samples were collected at three time
points, measured by liquid chromatography and online tandem mass
spectrometry (LC-MS/MS) and analysed with the MaxQuant software
package. We identified 84,676 peptide sequences and assigned
them to 6,445 unique proteins (false discovery rate <1% at the peptide
and protein level). A total of 5,279 of these proteins were quantified by
at least three heavy to light (H/L) peptide ratios (Fig. 1b). Tissue-specific
amino acid precursor pools and recycling rates, a pervasive
problem for in vivo pulse labelling experiments, did not appreciably
affect our results (Supplementary Fig. 1). For constant incorporation
rates the logarithm of H/L ratios should increase linearly with
time (Fig. 1c). Ninety-three per cent of proteins showed excellent
linear correlation indicated by a variability of the linear regression
slope smaller than 1% (Fig. 1d). Protein abundance did not influence
H/L ratio measurements (Supplementary Fig. 2). In total, we obtained
a confident set of 5,028 protein half-lives calculated from the slope of
the regression line. Cycloheximide chase experiments for selected
proteins spanning a representative range of half-lives agreed well with
half-lives determined by pulsed labelling and mass spectrometry
(Supplementary Fig. 3). In parallel, we pulse labelled newly synthesized
RNA for 2 h with 4sU. RNA samples were fractionated into the
newly synthesized and pre-existing fractions. Both fractions and the
total RNA sample were analysed by mRNA sequencing and quantified
by mapping reads to their exonic region. We calculated mRNA half-lives
based on the newly synthesized RNA/total RNA and
pre-existing RNA/total RNA ratios.
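
The regression step for proteins can be sketched as follows. This is an illustration under stated assumptions, not the published pipeline: it takes ln(1 + H/L) as the quantity that grows linearly with time, which holds if the light form decays exponentially while total protein per cell stays constant, and it optionally subtracts the dilution caused by cell division. The function names, the doubling-time parameter and the time points in the usage note are hypothetical.

```python
import math


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (with intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def protein_half_life(times_h, hl_ratios, doubling_time_h=None):
    """Half-life (h) from H/L ratios measured at several time points.

    The slope of ln(1 + H/L) against time is treated as the total loss
    rate of the light form. If a doubling time is given, the dilution
    rate ln(2)/doubling_time is subtracted to leave degradation only.
    """
    y = [math.log(1.0 + r) for r in hl_ratios]
    rate = ols_slope(times_h, y)
    if doubling_time_h is not None:
        rate -= math.log(2) / doubling_time_h
    if rate <= 0:
        raise ValueError("non-positive degradation rate; half-life undefined")
    return math.log(2) / rate
```

For example, `protein_half_life([2, 6, 12], ratios)` would fit three hypothetical harvesting times; with only three points the fit is sensitive to each measurement, which is why a check on slope variability matters.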
Figure 1 | Parallel quantification of mRNA and protein turnover and levels.
a, Mouse fibroblasts were pulse labelled with heavy amino acids (SILAC, left)
and the nucleoside 4-thiouridine (4sU, right). Protein and mRNA turnover was
quantified by mass spectrometry and next-generation sequencing, respectively.
b, Mass spectra of peptides from a high- and low-turnover protein reveal
increasing heavy to light (H/L) ratios over time. c, Protein half-lives were
calculated from log H/L ratios at all three time points using linear regression.
d, Variability of linear regression slopes assessed by leave-one-out
cross-validation was small.


Proteins were, on average, five times more stable (median half-life of
46 h) than mRNAs (9 h) and spanned a bigger dynamic range (Fig. 2a).
Because very long (>200 h) and very short (<30 min) protein half-lives
cannot be accurately quantified from our three time points, the
true dynamic range of protein stabilities may be even higher. Notably,
we found no correlation between protein and mRNA half-lives (Fig. 2c,
R² = 0.02, log-log scale).

Figure 2 | mRNA and protein levels and half-lives. a, b, Histograms of
mRNA (blue) and protein (red) half-lives (a) and levels (b). Proteins were on
average 5 times more stable and 900 times more abundant than mRNAs and
spanned a higher dynamic range. c, d, Although mRNA and protein levels
correlated significantly, correlation of half-lives was virtually absent.


Absolute mRNA and protein copy numbers


We calculated absolute cellular mRNA copy numbers based on the
number of sequencing reads in the unfractionated sample in conjunction
with information on cellular mRNA content. Absolute protein copy
numbers can be inferred from mass spectrometry data. To this end,
we used the sum of peak intensities of all peptides matching to a specific
protein. When divided by the number of theoretically observable peptides,
this value provides an accurate proxy for protein levels (‘intensity-based
absolute quantification’ or iBAQ, see Supplementary Methods).
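
The iBAQ calculation described above can be sketched as follows. The text specifies only the ratio of summed peptide intensities to the number of theoretically observable peptides; the way that number is counted here (an in-silico tryptic digest, cleaving after K or R unless followed by P, and keeping peptides of 6 to 30 residues) is an assumption made for illustration, and all function names are hypothetical.

```python
import re


def tryptic_peptides(sequence: str):
    """In-silico tryptic digest: cleave after K or R unless followed by P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def observable_peptide_count(sequence: str, min_len: int = 6, max_len: int = 30) -> int:
    """Number of digest products within the assumed observable length window."""
    return sum(min_len <= len(p) <= max_len for p in tryptic_peptides(sequence))


def ibaq(peptide_intensities, sequence: str) -> float:
    """Summed peak intensity divided by the count of observable peptides."""
    n_observable = observable_peptide_count(sequence)
    if n_observable == 0:
        raise ValueError("no theoretically observable peptides")
    return sum(peptide_intensities) / n_observable
```

Dividing by the observable-peptide count compensates for the fact that longer proteins yield more peptides, and therefore more summed signal, at the same copy number.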

Levels of detected proteins spanned approximately five orders of
magnitude (Fig. 2b). Relatively few proteins had fewer than 100 copies
per cell, indicating that some proteins of low abundance escaped
detection. Indeed, we observed a moderate detection bias (Supplementary
Fig. 4) and therefore restricted our analysis to genes that
were identified at both the mRNA and protein level. In this subset,
proteins were, on average, ~900 times more abundant than corresponding
transcripts. Despite a huge spread, mRNA and protein levels
were clearly correlated (Fig. 2d, R² = 0.41, log-log scale). This correlation
is considerably higher than in any previous study in mammals.
An attempt to improve this correlation further by nonlinear
transformation resulted only in a marginal increase (R² = 0.44,
Supplementary Fig. 5). It seems that for our data set, this is about
the maximum correlation between mRNA and protein that can be
achieved without additional information.
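
For readers who wish to reproduce this kind of comparison on their own data, a correlation on the log-log scale can be computed as in the sketch below. It assumes that R² denotes the squared Pearson correlation of log10-transformed copy numbers; the function name is hypothetical, and the sketch does not reproduce the values reported here.

```python
import math


def r_squared_loglog(mrna_copies, protein_copies) -> float:
    """Squared Pearson correlation of log10(mRNA) against log10(protein)."""
    x = [math.log10(v) for v in mrna_copies]
    y = [math.log10(v) for v in protein_copies]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)
```

Working on the log scale keeps a handful of very abundant genes from dominating the estimate when levels span several orders of magnitude.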


Reproducibility 

To investigate the experimental noise we performed a second independent
large-scale experiment and measured mRNA and protein
levels and half-lives again. The overall correlation of half-lives and
levels between both replicates was good (Supplementary Fig. 6 and
Supplementary Table 1). Removing less-consistent data points did
not increase correlation between mRNA and protein levels or half-lives
(Supplementary Fig. 7). Thus, noise has little impact on the
observed correlation between mRNA and protein levels and half-lives.
We also validated absolute mRNA and protein copy numbers using
independent methods. For mRNA copy numbers we used the
NanoString technology, which captures and counts individual transcripts
without enzymatic reactions. Correlation between sequencing
and NanoString data was high (r = 0.79, see also Supplementary
Fig. 8a). Absolute protein quantification was validated by spike-in
experiments using a mixture of 48 proteins with known concentrations
(Supplementary Fig. 8b). iBAQ values correlated well with
known absolute protein amounts over at least four orders of magnitude
and had a higher precision and accuracy than alternative measures
of absolute protein abundance (data not shown). We also
assessed degradation and synthesis rates for mRNAs and proteins by
actinomycin D and cycloheximide treatment, respectively. For high
turnover proteins and mRNAs we obtained results consistent with
pulse labelling data (Supplementary Fig. 8c–f).


A quantitative model of gene expression 


Our data allow us to calculate average synthesis rates of mRNAs and 
proteins for thousands of genes using a mathematical model (Fig. 3a 
and Supplementary Methods). The experimental data are based on a 
population of non-synchronized cells. Therefore, our estimated rates 
provide an average over the population and time. 
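The way synthesis rates follow from measured levels and half-lives can be sketched in a few lines. This is only an illustration under stated assumptions: a two-stage model, dR/dt = v_sr − k_dr·R and dP/dt = k_sp·R − k_dp·P, evaluated at steady state, with dilution by cell growth neglected (the full treatment is in the Supplementary Methods and may differ). The function names and the example numbers are hypothetical.

```python
import math

def degradation_constant(half_life_h):
    """First-order rate constant (per hour) from a half-life in hours."""
    return math.log(2) / half_life_h

def steady_state_rates(mrna_copies, protein_copies,
                       mrna_half_life_h, protein_half_life_h):
    """Return (v_sr, k_sp) for one gene under the steady-state assumption.

    v_sr: mRNA molecules synthesized per cell per hour.
    k_sp: proteins synthesized per mRNA per hour.
    Assumption: dilution by cell division is ignored.
    """
    k_dr = degradation_constant(mrna_half_life_h)
    k_dp = degradation_constant(protein_half_life_h)
    v_sr = k_dr * mrna_copies                    # from dR/dt = v_sr - k_dr*R = 0
    k_sp = k_dp * protein_copies / mrna_copies   # from dP/dt = k_sp*R - k_dp*P = 0
    return v_sr, k_sp

# Hypothetical gene: 20 mRNA copies (half-life 9 h),
# 50,000 protein copies (half-life 46 h).
v_sr, k_sp = steady_state_rates(20, 50_000, 9.0, 46.0)
print(f"v_sr = {v_sr:.2f} mRNAs/h, k_sp = {k_sp:.1f} proteins/mRNA/h")
```

For growing cells a dilution term would normally be added to both degradation constants; it is left out here to keep the sketch minimal.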

Average cellular transcription rates predicted by the model spanned 
two orders of magnitude with a median of about two mRNA molecules 
per hour (Fig. 3b). An extreme example was Mdm2 with more than 500 
mRNAs per hour. A microscopic study on the cytomegalovirus (CMV) 
promoter reported transcription termination rates of 5.8 to 8.7 mRNAs 
per hour. These values are above the median of our predictions, as
perhaps expected for a strong promoter system. Next, we calculated 
translation rate constants; that is, how many proteins are made from 
each mRNA template per hour (Fig. 3c). We find a median translation 
rate constant of about 40 proteins per mRNA per hour. Several proteins 
involved in translational regulation—such as the translation initiation 
factor eIF4G1, fragile X syndrome related protein Fxr2 and tuberin— 
had extremely low rate constants and were translationally repressed. 
Plotting translation rate constants against protein levels revealed that 
abundant proteins are translated about 100 times more efficiently than 
those of low abundance (Fig. 3d). Hence, different translation efficiencies 
contribute to the higher dynamic range of proteins compared to mRNAs 
(Fig. 2b). Intriguingly, translation rate constants saturated at around 180 
protein copies per mRNA per hour. To our knowledge, the maximal
translation rate constant in mammals is not known. On the basis of ref. 1,
the estimated maximal translation rate constant in sea urchin embryos
is 140 copies per mRNA per hour, which is surprisingly close to the
prediction of our model.

Figure 3 | Quantitative model of gene expression in growing cells.
a, mRNAs are synthesized with the rate v_sr and degraded with a rate constant
k_dr. Proteins are translated and degraded with rate constants k_sp and k_dp,
respectively. b, Calculated mRNA transcription rates show a uniform
distribution. c, Calculated translation rate constants are not uniform.
d, Translation rate constants of abundant proteins saturate between
approximately 120 and 240 proteins per mRNA per hour. Red line shows the
locally weighted fit (Lowess). Dashed lines indicate 95% confidence intervals of
the Lowess maximum value calculated by bootstrapping.


Control of gene expression 


A long-standing question is how much protein abundance is con- 
trolled at the transcriptional, post-transcriptional, translational and 
post-translational levels. Until now, this has mainly been addressed 
indirectly by analysing mRNA and protein sequence features. Features 
related to translation initiation (for example, Shine-Dalgarno, Kozak 
and 3’ untranslated region (UTR) sequences), elongation (for example, 
codon bias) and protein stability (for example, degrons) have been ana- 
lysed and reported to correlate partially with protein/mRNA ratios in 
bacteria, yeast and mammals. We also observed sequence features
characteristic of mRNA and protein stability and found that mRNAs 
with long 3’ UTRs are, on average, less stable (Supplementary Fig. 9). In 
addition, the density of AU-rich elements and binding motifs of a spe- 
cific RNA-binding protein (pumilio 2) correlated negatively with 
mRNA stability (Supplementary Fig. 10). Highly structured proteins 
were more stable than unstructured ones (Supplementary Fig. 11a). 
We also identified amino acids over-represented in unstable proteins 
(Supplementary Fig. 11b). 

Sequence features are at best indirect proxies for mechanisms con- 
trolling protein abundance. How much efficiencies of different steps in 
the gene expression cascade contribute to variance of cellular protein 
copy numbers can only be revealed by direct parallel genome-scale 
measurements of mRNA and protein levels and half-lives which were 
not available previously. In our data the coefficient of determination 
(R²) between mRNA and protein copy numbers is 0.41 (Fig. 2d).
Assuming the absence of technical and biological noise, this means 
that ~40% of the variance in protein levels is explained by mRNA 
levels—considerably more than previously thought (Fig. 4a). Most of 
this 40% is due to different transcription rates, whereas mRNA stability 
has a smaller role. Considering translation rate constants markedly 
boosts R² to 0.95. Thus, translation rate constants have the dominant
role for control of protein levels. Unexpectedly, the impact of protein 
degradation is rather small. 

In the above analysis the same experimental data were used to 
calculate synthesis rates and to estimate their impact on protein levels. 
To avoid this over-fit and to assess reliability of the model predictions 
we performed the same analysis with data from the biological replicate 
experiment. In the replicate the coefficient of determination between 
mRNA and protein levels was 0.37 (Fig. 4b). We then used the model 
including the estimated parameters from the first experiment to pre- 
dict protein levels from mRNA levels in the replicate data. Predicted 
protein levels agreed very well with measured protein levels 
(R² = 0.85, Fig. 4c). Therefore, the model explains ~85% of the variability
in protein copy numbers in an independent experiment. The
correlation is very similar to the direct comparison of protein levels in 
both experiments (R² = 0.84, Supplementary Fig. 6d). We conclude
that technical and biological noise in our data are low, and that the 
model faithfully predicts protein levels from mRNA levels in mouse 
fibroblasts. It also indicates that the estimated impact of transcription, 
mRNA stability, translation and protein stability on protein abund- 
ance is reproducible. We finally assessed how much of the efficiencies 
of the various steps in gene expression are retained in a different cell 
type and organism. To this end, we quantified mRNA and protein 
abundance in the human breast cancer cell line MCF7 by RNA-seq 
and mass spectrometry, respectively. A total of 2,030 human genes 
from the MCF7 data set had orthologues in the mouse fibroblast data. 
We then used rates from the mouse fibroblast model to predict protein 
levels from mRNA levels in human breast cancer cells. In MCF7 cells, 
the model predicted ~60% of the variability in protein levels (Fig. 4a). 
Although the fraction explained by the model is smaller than in mouse 
fibroblasts, this indicates that translation and degradation rates are to
some extent independent of the cell type and conserved between mouse
and human. It is noticeable, however, that the drop in prediction is
mainly due to the fact that the translation part of the model performs
less well.

Figure 4 | Impact of different rates and rate constants on protein
abundance. a, Protein levels are best explained by translation rates, followed by
transcription rates. mRNA and protein stability is less important (left bar). b, In
the replicate experiment mRNA levels explained 37% of protein levels in
NIH3T3 cells (middle bar in a). c, The model explains 85% of variance in
protein levels from measured mRNA levels (middle bar in a). The mouse
fibroblast model has some predictive power for human orthologous genes in
MCF7 cells (right bar in a). Error bars show 95% confidence intervals estimated
by bootstrapping.
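The cross-validation described above (rate constants estimated in one experiment, protein levels predicted from mRNA levels in another) can be illustrated as follows. This is a sketch under assumptions, not the procedure of the Supplementary Methods: the steady-state relation P = k_sp·R / k_dp is assumed, R² is taken as the squared Pearson correlation of log10 values, and all gene names and numbers are hypothetical. Only genes present in every input are used, mirroring the restriction to genes with orthologues.

```python
import math

def predict_protein(mrna, k_sp, k_dp):
    """Predict steady-state protein copies, P = k_sp * R / k_dp, per shared gene."""
    shared = mrna.keys() & k_sp.keys() & k_dp.keys()
    return {g: k_sp[g] * mrna[g] / k_dp[g] for g in shared}

def r2_log(predicted, measured):
    """Squared Pearson correlation of log10 values over genes in both dicts."""
    genes = sorted(predicted.keys() & measured.keys())
    x = [math.log10(predicted[g]) for g in genes]
    y = [math.log10(measured[g]) for g in genes]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Hypothetical rate constants (per hour) fitted on a first experiment.
k_sp = {"geneA": 40.0, "geneB": 100.0, "geneC": 5.0}
k_dp = {"geneA": 0.02, "geneB": 0.01, "geneC": 0.05}
# Hypothetical mRNA and protein copy numbers from a second experiment.
mrna_rep = {"geneA": 10.0, "geneB": 50.0, "geneC": 20.0, "geneD": 3.0}
protein_rep = {"geneA": 2.5e4, "geneB": 4.0e5, "geneC": 1.5e3}

predicted = predict_protein(mrna_rep, k_sp, k_dp)
print(sorted(predicted))   # geneD is dropped: it has no fitted parameters
print(r2_log(predicted, protein_rep))
```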


Half-lives and gene function 


Degradation of proteins is critically involved in many cellular processes 
including cell-cycle progression, signal transduction and apoptosis.
Similarly, mRNA stability is important for the temporal order of gene 
induction. Genes may have evolved specific combinations of mRNA
and protein half-lives under functional constraints. We therefore
asked if genes with specific combinations of mRNA and protein stability 
have distinct biological functions. We grouped genes according to their 
half-lives and used gene ontology to find enriched biological processes 
(Fig. 5; see Supplementary Table 2 for a complete list). 
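The grouping step can be pictured as a split into four classes by mRNA and protein stability. The sketch below is illustrative only: it assumes a cut at the median half-life of each molecule type, which is not stated in the text, and uses hypothetical gene names and half-lives. The enrichment analysis itself is not shown.

```python
from statistics import median

def group_by_half_life(mrna_hl, protein_hl):
    """Assign each gene to one of four (mRNA, protein) stability classes.

    Assumption: 'stable' means a half-life above the median of the data set.
    """
    genes = sorted(mrna_hl.keys() & protein_hl.keys())
    m_cut = median(mrna_hl[g] for g in genes)
    p_cut = median(protein_hl[g] for g in genes)
    groups = {(m, p): [] for m in ("stable", "unstable")
                         for p in ("stable", "unstable")}
    for g in genes:
        m = "stable" if mrna_hl[g] > m_cut else "unstable"
        p = "stable" if protein_hl[g] > p_cut else "unstable"
        groups[(m, p)].append(g)
    return groups

# Hypothetical half-lives in hours.
mrna_hl = {"A": 20.0, "B": 2.0, "C": 15.0, "D": 1.0}
protein_hl = {"A": 100.0, "B": 80.0, "C": 5.0, "D": 3.0}
for key, members in group_by_half_life(mrna_hl, protein_hl).items():
    print(key, members)
```

Each of the four gene lists would then be tested for enriched gene ontology terms against the full set.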

Genes with stable mRNAs and stable proteins were enriched in 
constitutive cellular processes like translation (that is, ribosomal 
proteins), respiration and central metabolism (glycolysis, citric acid 
cycle). Hence, many housekeeping genes tend to have stable mRNAs 
and proteins. In yeast, energy costs keep transcription and translation
rates under selective pressure. We reasoned that energy constraints
may explain why housekeeping genes tend to have stable mRNAs and 
proteins. On the basis of the model, we calculated the theoretical 
energy required to maintain cellular mRNA and protein levels by 
recycling from their building blocks (nucleotide monophosphates 
and amino acids, respectively) in terms of high energy phosphates. 
This is a conservative estimate as splicing, folding and transport are 
not included. Protein synthesis consumes more than 90% of the 
energy whereas less than 10% is needed for transcription. A total of 
20% of the proteins consumed 80% of the energy for translation 
(Pareto principle or 80/20 rule). Consistent with optimization under 
energy constraints, abundant proteins were significantly more stable 
than less abundant ones (Supplementary Fig. 12a, P< 10°,
Wilcoxon test). This is not necessarily expected because the overall
contribution of protein stability to protein levels is very small
(Fig. 4a). In addition, abundant proteins were significantly shorter
(Supplementary Fig. 12b). Shuffling protein half-lives and lengths
markedly increased theoretical energy consumption (Supplementary
Fig. 12c). Collectively, these observations indicate that mammalian gene
expression evolved under energy constraints.

Figure 5 | Functional characteristics of genes with different mRNA and
protein half-lives. Genes were grouped according to their combination of
mRNA and protein half-lives and analysed for enriched gene ontology terms. A
heat map of enrichment P-values reveals functional similarities of genes with
similar combinations of half-lives.
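The energy estimate and the 80/20 observation can be illustrated with a toy calculation. The sketch assumes a cost of four high-energy phosphates per amino acid added (a textbook approximation, not a value taken from this study) and hypothetical per-gene numbers. The function names are illustrative.

```python
def translation_energy(proteins_per_h, length_aa, phosphates_per_aa=4.0):
    """High-energy phosphates spent per hour to synthesize one protein species.

    Assumption: a fixed cost per amino acid; folding and transport are ignored.
    """
    return proteins_per_h * length_aa * phosphates_per_aa

def top_fraction_share(costs, fraction=0.2):
    """Share of the total cost consumed by the most expensive `fraction` of items."""
    ordered = sorted(costs, reverse=True)
    n_top = max(1, round(len(ordered) * fraction))
    return sum(ordered[:n_top]) / sum(ordered)

# Hypothetical genes: (protein molecules made per hour, length in amino acids).
genes = [(2000, 400), (125, 400), (100, 500), (250, 200), (50, 1000)]
costs = [translation_energy(rate, length) for rate, length in genes]
print(f"top 20% of genes consume {top_fraction_share(costs):.0%} of the energy")
```

Shuffling the half-lives or lengths among genes and recomputing the total would give the kind of random baseline the text compares against.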

The subset of genes with unstable mRNAs and proteins was strongly 
enriched in transcription factors, signalling genes, chromatin modifying 
enzymes and genes with cell-cycle-specific functions (Fig. 5). Because 
mRNAs and proteins are information carriers, their degradation can be
interpreted as a built-in timer that controls the persistence of genetic
information. It therefore makes intuitive sense that many regulatory
genes have short mRNA and protein half-lives. However, it must be 
stressed that population-level data cannot provide information about 
individual cells or molecules. 

The group of genes with stable proteins but unstable mRNAs was 
strongly enriched in terms related to processing of mRNAs, tRNAs and 
non-coding RNAs. Hence, many mammalian RNA-binding proteins 
are stable whereas their encoding transcripts are short lived, as also 
found in yeast. Because many RNA-binding proteins bind their own
message, this observation is indicative of a post-transcriptional negative
feedback loop for RNA-binding proteins. Consistently, we found
that unstable mRNAs are enriched for binding motifs of RNA-binding 
proteins (Supplementary Fig. 10). 

Finally, the subset of genes with stable mRNAs and unstable proteins
was rich in extracellular proteins. This is expected, as secreted proteins
have a short cellular half-life. Additionally, this group contains proteins
involved in cellular homeostasis, defence response and proteolysis. This
set contains two ferritin proteins that are rapidly upregulated in response
to iron. Ferritins are classic examples of translationally regulated
genes. As translational regulation is not dependent on mRNA
half-lives, genes with stable mRNAs can still be dynamically regulated 
as long as their protein half-lives are short. It is tempting to speculate 
that other homeostasis genes in this group are regulated at the level of 
translation. 


Discussion 


Although gene expression is one of the most fundamental processes in 
biology it has never been quantified comprehensively. We provide the 
first analysis of mRNA and protein levels, half-lives, transcription 
rates and translation rate constants for thousands of genes. In the 
future, additional methods like sequencing of nascent transcripts 
and ribosome profiling may further refine this picture.

We found that mRNA levels explain around 40% of the variability 
in protein levels. This fraction is higher than in previous studies on 
mammals. We found that in mouse fibroblasts, translation efficiency
is the single best predictor of protein levels. Hence, protein
abundance seems to be predominantly regulated at the ribosome, 
highlighting the importance of translational control. Whether this
observation is valid in other cell types is not known. A recent study on 
embryonic stem cells revealed that changes in protein levels are not 
accompanied by changes in corresponding mRNAs. It is also not
clear how much translation rate constants change under different 
conditions. Our observation that the mouse model can to some degree 
predict levels of orthologous proteins in MCF7 cells suggests that 
translation efficiency is partially ‘hard-coded’ in the genome and is 
not subject to change. 

Compared to translational control, protein stability seems to have a 
minor role in cellular protein abundance in our system. This is sur- 
prising as protein degradation is involved in the regulation of many 
cellular processes. From the global perspective, the dominance of
translational regulation makes sense given the high energy costs 
associated with protein synthesis. However, it should also be stressed 
that our data set represents average values derived from a population 
of dividing, non-synchronized cells. At the single cell level, the role of 
protein degradation for protein abundance may be higher. Similarly, 
protein degradation may be more important upon perturbation. 

Gene expression may follow certain design principles for optimal 
evolutionary fitness. Intriguingly, we found that genes with certain 
combinations of mRNA and protein half-lives share common func- 
tions, indicating that they evolved under similar constraints. One of 
these constraints may be energy efficiency. Consistently, we observed
that the theoretical energy needed for gene expression is much lower 
than random. A second constraint may be the ability of genes to 
respond quickly to a stimulus. We find that many transcription factors 
and genes with cell-cycle-specific function have unstable mRNAs and
proteins, predisposing them to rapid transcriptional and/or translational
regulation. In addition, genes with stable mRNAs but unstable
proteins can be regulated quickly at the level of translation. These 
observations are consistent with the idea that many fast-responding 
genes have short protein and/or mRNA half-lives. The global
picture is that most mRNAs and especially proteins are stable unless 
genes need to respond quickly to a stimulus. Owing to the trade-off 
between dynamic regulation and energy efficiency, this may be an 
optimal design. 

Our data provide a rich resource for the scientific community that
can be mined in many ways that are beyond the scope of this study
(see Supplementary Table 3 for the entire data set). For example, we 
provide by far the largest data set on protein copy numbers, which 
contains valuable information for modelling of cellular processes and 
stoichiometry of protein complexes. Half-lives of proteins and
mRNAs can be used to search for properties of unstable mRNAs or 
proteins, and we provide a first analysis of characteristic sequence 
features (Supplementary Figs 9 and 10). Genome-scale quantitative 
data on absolute mRNA and protein levels and half-lives will certainly 
help to understand the complex relationships between thousands of 
genes and their products in biological systems. 
Note added in proof: While this paper was in revision, another paper
reported that changes in mRNA levels in dendritic cells are mainly 
determined by transcription rates. This result is consistent with our 
findings in fibroblasts. Notably, mRNA half-lives reported in ref. 44 
are considerably shorter (see Supplementary Information for a brief 
discussion). 


METHODS SUMMARY 


NIH3T3 cells grown in light (L) SILAC medium were simultaneously pulse- 
labelled with heavy (H) amino acids and 4-thiouridine (4sU). For proteome 
analysis, proteins were extracted, separated by SDS-polyacrylamide gel electro- 
phoresis (PAGE), trypsin-digested and analysed by LC-MS/MS on high-resolution 
instruments (LTQ-Orbitrap XL and Velos, Thermo Fisher). Raw files were pro- 
cessed by MaxQuant (version 1.0.13.13) for peptide/protein identification and 
quantification. In total 3,588,163 fragment spectra led to 972,333 peptide identi- 
fications (84,676 unique peptide sequences) that were assigned to 6,445 unique 
proteins (false discovery rate of 1% at the peptide and protein level). Average 
absolute mass deviation was 0.29 parts per million (p.p.m.). Absolute protein 
amounts were calculated as the sum of all peptide peak intensities divided by 
the number of theoretically observable tryptic peptides (intensity based absolute 
quantification, or iBAQ). RNA was extracted and separated into newly synthesized 
and pre-existing fractions based on the incorporated 4sU. Total, pre-existing 
and newly synthesized RNA samples were processed according to an mRNA 
sequencing protocol (two rounds of oligo(dT) enrichment) and analysed on a 
Solexa GAIIX sequencing platform (36 cycles). Reads were mapped to the mouse 
genome reference sequence (mm9, July 2007) using SOAP2 with a maximum of
two mismatches allowed. Only uniquely mapped reads were retained. For more 
details on data acquisition, processing, analysis and modelling see Supplementary 
Methods. 
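The iBAQ definition given above (summed peptide intensities divided by the number of theoretically observable tryptic peptides) can be sketched as follows. This is an illustration under assumptions: cleavage after K or R except before P, and a peptide length window of 6 to 30 residues for "observable" peptides. Neither detail is stated in this summary, and the sequence and intensities are hypothetical.

```python
import re

def tryptic_peptides(sequence, min_len=6, max_len=30):
    """In-silico tryptic digest: cleave after K or R unless followed by P.

    Assumption: only peptides of min_len..max_len residues count as observable.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return [p for p in pieces if min_len <= len(p) <= max_len]

def ibaq(peptide_intensities, sequence, min_len=6, max_len=30):
    """Sum of peptide peak intensities per theoretically observable peptide."""
    n_observable = len(tryptic_peptides(sequence, min_len, max_len))
    if n_observable == 0:
        raise ValueError("no theoretically observable tryptic peptides")
    return sum(peptide_intensities) / n_observable

# Hypothetical protein sequence and measured peptide intensities.
seq = "AAAAAAKCCCCCCRPDDDDDDKEE"
print(tryptic_peptides(seq))
print(ibaq([1.0e6, 3.0e6], seq))
```

Dividing by the number of observable peptides rather than by protein length is what makes the value comparable between proteins that yield different numbers of detectable peptides.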
TET1 and hydroxymethylcytosine in
transcription and DNA methylation fidelity 


Kristine Williams, Jesper Christensen, Marianne Terndrup Pedersen, Jens V. Johansen, Paul A. C. Cloos,
Juri Rappsilber & Kristian Helin


Enzymes catalysing the methylation of the 5-position of cytosine (mC) have essential roles in regulating gene expression 
and maintaining cellular identity. Recently, TET1 was found to hydroxylate the methyl group of mC, converting it to 
5-hydroxymethyl cytosine (hmC). Here we show that TET1 binds throughout the genome of embryonic stem cells, with 
the majority of binding sites located at transcription start sites (TSSs) of CpG-rich promoters and within genes. The hmC 
modification is found in gene bodies and in contrast to mC is also enriched at CpG-rich TSSs. We further provide evidence
that TET1 has a role in transcriptional repression. TET1 binds a significant proportion of Polycomb group target genes.
Furthermore, TET1 associates and colocalizes with the SIN3A co-repressor complex. We propose that TET1 fine-tunes
transcription, opposes aberrant DNA methylation at CpG-rich sequences and thereby contributes to the regulation of
DNA methylation fidelity.


The majority of CpGs in mammalian genomes are methylated. An 
exception to this is CpG islands, which are found in more than 60% of 
all mammalian gene promoters. These are often unmethylated and 
can be either transcriptionally active or inactive depending on other 
factors, including histone modifications and the activity of cell-type-specific
transcription factors. In current models for gene regulation,
CpG methylation in promoters leads to stable gene silencing, whereas 
the function of intragenic methylation might, like trimethylation of 
histone 3 lysine 36 (H3K36me3), repress the initiation of intragenic 
transcription.

DNA methyltransferases are essential for embryogenesis, and the 
methylation pattern of the mammalian genome undergoes major 
changes during development. As an example, global waves of DNA 
demethylation and remethylation take place after fertilization, and 
gene-specific de novo methylation occurs during differentiation of 
embryonic stem (ES) cells. Importantly, patterns of DNA methylation
are perturbed in human diseases such as imprinting disorders
and cancer. So far there is very limited knowledge regarding the
mechanisms leading to DNA hypermethylation of CpG-island pro- 
moters in cancer, and how CpG-islands generally remain unmethy- 
lated in somatic cells. 

Enzymes contributing to DNA demethylation could potentially 
provide a fidelity system for DNA methylation, but such enzymes 
were not known until recently. In a ground-breaking paper, TET1 
was shown to catalyse the hydroxylation of mC, which has led to the
proposal of several models for how TET1 and hmC may contribute to 
DNA demethylation and gene regulation. One possibility is that 
hydroxylation of mC by TET1 might interfere with DNMT1 activity, 
leading to a subsequent passive loss of methylation following replica- 
tion. Alternatively, hmC may be converted to cytosine through 
hitherto unknown enzymatic mechanisms. In addition, hydroxyla- 
tion of mC may promote transcriptional de-repression by dissoci- 
ation of mC-binding proteins and/or recruitment of effector 
proteins. The demonstration that hmC is highly abundant in ES cells 
and in neuronal Purkinje cells indicates that this modification is stably
present in the mammalian genome and that it might be important for
gene regulation.


TET1 binds CpG-rich transcription start sites 


TET1 is highly expressed in mouse ES cells and is rapidly downregulated
during their differentiation. To obtain more information
regarding the function of TET1, we inhibited TET1 expression in 
mouse ES cells using two different shRNA constructs (Fig. 1a and
Supplementary Fig. 1a). The efficient knockdown of Tet1 did not lead
to any change in proliferation rate or expression of NANOG and
OCT4 (Fig. 1a and Supplementary Fig. 1a, b). These data are in agreement
with a recently published study, but in contrast to results
reported by others. We also observed inhibition of growth and
decreased levels of NANOG in mouse ES cells when using the Tet1
shRNA sequences published in the latter study (Supplementary Fig.
1c, d). However, as these shRNA sequences do not lead to greater 
knockdown efficiency than the ones we have used (Supplementary 
Fig. 1c), it is possible that shRNA off-target effects could cause the 
observed phenotype. 

We determined the genome-wide location of TET1 by using two 
different antibodies to TET1 (Tet1-N and Tet1-C) for chromatin
immunoprecipitation followed by DNA sequencing (ChIP-seq). 
These experiments were performed in control or TET1-depleted 
mouse ES cells. The two TET1 antibodies were highly specific as 
shown in the examples provided in Fig. 1b and by the fact that 97- 
99% of the identified TET1 binding sites were not found in the TET1- 
depleted cells (Supplementary Fig. 2a). The majority of TET1 binding 
sites were found in gene bodies, with the highest density around TSSs 
(Fig. 1c). Gene annotation of TET1 binding sites, using a false discovery
rate (FDR) < 0.01, showed that TET1 binds in the vicinity of the TSS of
6,573 genes (Fig. 1d and Supplementary Table 1), of which all tested so 
far have been independently validated by ChIP followed by real-time 
quantitative PCR (ChIP-qPCR, Supplementary Fig. 2b and data not 
shown). Peak detection analysis using FDR < 0.1 indicates that TET1 
could have up to 9,241 target genes (Supplementary Fig. 3a). Gene
Ontology analysis showed that TET1 target genes are involved in a
variety of basic cellular processes, and in more specific processes such
as development and differentiation (Supplementary Fig. 3b). The
majority of the TET1 target genes are associated with high and intermediate
density CpG promoters (HCPs and ICPs, Fig. 1e), which are
positive for H3K4me3 (Fig. 1f). The correlation between TET1 binding
and high CpG density is also found outside of TSSs (Supplementary
Fig. 4). Interestingly, TET1 binding does not predict whether a promoter
is active, poised for activation (non-productive) or inactive
(Fig. 1g). In agreement with this, we found that a significant fraction
of TET1 was associated with promoters containing the H3K27me3
Polycomb repressive mark (Fig. 1f). Indeed, independent analysis
showed a highly significant overlap of genes bound by TET1 and the
Polycomb group (PcG) protein, SUZ12, in ES cells (Supplementary Fig.
5a, b).

Figure 1 | Identification of TET1 target genes. a, Western blot showing
TET1, OCT4 and NANOG levels for control-transfected (shScr) and TET1-depleted
(shTet1#3 and shTet1#5) mouse ES cells. b, Examples of TET1 ChIP-seq
results in control or Tet1 knockdown ES cells. ChIP-seq was performed
using both an anti-N- and anti-C-terminal TET1 antibody (Tet1-N and Tet1-C).
y-axis of binding profiles denotes number of sequence tag reads. c, Left
panel, mean distribution of tags across gene bodies for TET1 ChIP-seq in
control and TET1 knockdown cells. Right panel, diagram illustrating the overall
distribution of TET1 binding sites into TSS (+1 kb), promoter (−1 to −5 kb),
exon, intron and intergenic regions. d, Venn diagram illustrating the overlap of
TET1 target genes using anti-TET1-N and -C antibodies. e, f, Histograms
showing promoter CpG density, divided into high-, intermediate- or low-density
CpG promoters (HCP, ICP or LCP) as defined in ref. 25 (e) or
distribution of H3K4me3 (K4) and H3K27me3 (K27) (f) for all genes or for
TET1 target genes. g, Overlay of TET1 target genes with active genes (RNA
polymerase II binding and H3K79me2), non-productive (RNA polymerase II
binding, no H3K79me2) and inactive (no RNA polymerase II binding or
H3K79me2).


hmC is enriched at TSSs and gene bodies

To gain information regarding a possible function of hmC, we
generated an affinity-purified polyclonal antibody to hmC that binds
with high specificity and sensitivity to this mark, as shown by enzyme-linked
immunosorbent (ELISA) and DNA immunoprecipitation
(DIP) assays (Supplementary Fig. 6). Genome-wide DIP-seq assays
were performed using anti-hmC, anti-mC and IgG on genomic DNA
purified from control or TET1-depleted ES cells as well as from Dnmt
triple knockout (TKO) mouse ES cells, lacking Dnmt1, Dnmt3a and
Dnmt3b. We confirmed by ChIP-qPCR that TET1 localizes to its
target genes in the Dnmt TKO cells (Supplementary Fig. 7a). The
analyses showed that hmC is located as discrete peaks throughout
the genome (Fig. 2a). Furthermore, the majority of signals obtained
with the hmC antibody were absent in Dnmt TKO mouse ES cells,
confirming that generation of hmC requires the pre-existence of mC
(Fig. 2a). The hmC modification in mouse ES cells is particularly
enriched within gene bodies as also observed for the mC mark
and recently reported for hmC in mouse cerebellum (Fig. 2b, c).
Strikingly, in contrast to the localization of mC, hmC is also significantly
enriched at the TSS coinciding with TET1 (Fig. 2c), indicating
that a significant fraction of mC is converted to hmC at the TSS. Also, 
the hmC modification is generally not detectable at repetitive elements 
such as intracisternal A particle (IAP) elements and minor satellite 
repeats by DIP-qPCR (Supplementary Fig. 7b), further demonstrating 
that hmC and mC show distinct genomic distributions. 

Gene annotation of hmC positive regions around the TSS (−0.7
kilobases to +0.3 kb) showed that 2,424 regions are hmC-positive in 
wild-type ES cells compared to Dnmt TKO ES cells. Approximately 
28% of these regions showed a more than twofold reduction in hmC 
signal in the DIP-seq analyses upon downregulation of TET1 (Fig. 2d)
and in validation experiments the knockdown of Tet1 led to a significant
decrease in hmC levels on tested genes (Fig. 2e and data not
shown). Depending on the false discovery rate cut-off used for
TET1, between 35% (FDR < 0.01) and 50% (FDR < 0.1) of hmC-positive
genes are bound by TET1 (Fig. 2f). These results are in agreement
with reports showing that Tet1 knockdown only causes a partial
decrease in global hmC levels in mouse ES cells, and imply that,
although TET1 is important for the generation of hmC, other 
enzymes such as TET2 are also likely to contribute to hmC levels in 
mouse ES cells. 
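The two percentages in this paragraph are simple set and threshold calculations, which the following sketch makes explicit. It is only an illustration: the function names, region and gene identifiers and signal values are hypothetical, and the actual peak calling and normalization are not reproduced.

```python
def fraction_bound(hmc_positive_genes, tet1_targets):
    """Fraction of hmC-positive genes that are also TET1 targets."""
    hmc = set(hmc_positive_genes)
    return len(hmc & set(tet1_targets)) / len(hmc)

def fraction_reduced(control, knockdown, fold=2.0):
    """Fraction of regions whose signal drops more than `fold` on knockdown.

    Written as control > fold * knockdown to avoid dividing by a zero signal.
    """
    regions = control.keys() & knockdown.keys()
    reduced = [r for r in regions if control[r] > fold * knockdown[r]]
    return len(reduced) / len(regions)

# Hypothetical gene lists; a looser FDR cut-off gives a larger target list.
hmc_genes = ["g1", "g2", "g3", "g4"]
tet1_strict = ["g2", "g9"]            # e.g. targets called at FDR < 0.01
tet1_loose = ["g2", "g4", "g9"]       # e.g. targets called at FDR < 0.1
print(fraction_bound(hmc_genes, tet1_strict), fraction_bound(hmc_genes, tet1_loose))

# Hypothetical hmC signal per TSS region in control and knockdown cells.
control = {"r1": 10.0, "r2": 8.0, "r3": 6.0, "r4": 4.0}
knockdown = {"r1": 4.0, "r2": 5.0, "r3": 2.9, "r4": 4.0}
print(fraction_reduced(control, knockdown))
```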

As for TET1, Gene Ontology analysis of the hmC-positive genes 
showed enrichment for genes involved in basic cellular processes, but 
also in the regulation of development and differentiation (Sup- 
plementary Fig. 7c). Moreover, hmC positivity does not correlate with 
transcriptional activation and surprisingly, most hmC-positive genes 
seem not to be expressed in mouse ES cells (Fig. 2g). 

A significant proportion of the TSSs classified as positive for hmC 
has intermediate or high CpG content (Fig. 2h and Supplementary 
Fig. 4). Genome-wide analyses of the hmC distribution relative to 
CpG content showed that the hmC mark is enriched in regions with 
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Figure 2 | Hydroxymethylcytosine localizes to TSS and gene body. 

a, Examples of hmC DIP-seq results in mouse ES cells. ChIP-seq profiles of 
TET! are included for comparison. b, Diagram illustrating the overall 
distribution of hmC into TSS (+1 kb), promoter (—1 to —5 kb), exon, intron 
and intergenic regions. c, The mean distribution of tags across gene bodies for 
hmC, mC and IgG. d, Almost a third (28%) of hmC positive TSSs showed a 
more than twofold reduction in hmC signal in mouse ES cells depleted of TET1. 
e, DIP-qPCR was performed in control mouse ES cells, Tet! knockdown cells 
(shTet1#3 and shTet1#5), and Dnmt TKO cells as indicated. f, Overlay of genes 


relatively high CpG content compared to mC (Fig. 2i). Whereas only 
15% of hmC-positive TSSs also contain a high mC signal, we find that 
several hmC-positive regions have low levels of mC, implying that the 
two marks often co-exist. Upon Tet] knockdown only a minor global 
increase in mC was observed as evaluated by genome-wide anti-mC 
DIP (Me-DIP) (Supplementary Fig. 8a). However, a few hundred 
genes show modest TSS specific increases in mC levels after Tet 
knockdown (Supplementary Fig. 8b). Gene Ontology analyses for 
these genes showed enrichments for specialized developmental pro- 
cesses (Supplementary Fig. 8c). Interestingly, we found that approxi- 
mately a third of the genes reported to acquire DNA methylation 
during ES cell differentiation*’ are marked by hmC in the ES cell state 
(Supplementary Table 2). Taken together, these results show that hmC 
colocalizes with mC in gene-bodies, and that hmC, in contrast to mC, is 
enriched at TSSs with intermediate to high CpG density, where it may 
contribute to the regulation of DNA methylation patterns. 


TET1 contributes to transcriptional repression 


To understand how TET1 contributes to the regulation of target 
genes, we performed genome-wide expression analyses of mouse ES 
cells expressing two different Tet1 shRNAs or a scrambled shRNA 
(Supplementary Fig. 9a, b and Supplementary Table 3). As shown in 
Fig. 3a and Supplementary Fig. 9c, we observed a significant decrease 


positive for hmC at the TSS with TET1 target genes using FDR cut-off values of
0.01 or 0.1 in the ChIP-seq analysis. g, Overlay of hmC-positive genes with
active genes (RNA polymerase II binding and H3K79me2), non-productive
(RNA polymerase II binding, no H3K79me2) and inactive (no RNA
polymerase II binding or H3K79me2)**’. h, Distribution of high-,
intermediate- or low-density CpG promoters (HCP, ICP or LCP)” for all genes
or hmC-positive genes. i, Plot illustrating the genome-wide correlation of
TET1, hmC and mC signal intensity (rpm, reads per million) with CpG density.
All error bars denote s.d., n = 3.


in expression of 556 genes and a significant increase in expression of 
851 genes common to both shRNAs. Of these approximately 700 were 
direct target genes of TET1, and therefore only around 10% of all
TET1 target genes change expression following Tet1 knockdown.
Whereas we expected a significant fraction of the downregulated
genes to be direct targets of TET1, we were surprised to find
that an even higher fraction of the upregulated genes were associated 
with TET1 (Fig. 3a). To validate these results, we performed qPCR 
analysis of a number of downregulated and upregulated genes 
(Fig. 3b) that were also directly bound by TET1 (Supplementary 
Fig. 2b). Moreover, several of the identified targets show similar 
expression change upon differentiation of mouse ES cells by retinoic 
acid, which leads to decreased levels of TET1 (Supplementary Fig. 9d). 

To investigate whether the transcriptional effects of TET1 are 
mediated by modulating hmC and mC levels, we performed knock- 
down of Tet1 in Dnmt TKO cells (Supplementary Fig. 10a). We found 
that all the tested transcriptional effects by knockdown of Tet1 were 
similar in Dnmt TKO and normal ES cells (Fig. 3c and Supplemen- 
tary Fig. 10b), indicating that the effects are independent of catalytic 
activity. However, we cannot rule out that TET1-dependent modu- 
lation of hmC and mC might contribute to transcriptional fine-tuning 
at some target genes. Taken together, these results indicate that TET1 
can contribute to transcriptional repression, and to a minor extent

Figure 3 | Knockdown of Tet1 in ES cells affects transcription. a, Microarray
analyses were performed in control (shScr) and Tet1 knockdown cells
(shTet1#4 and shTet1#5) in triplicate. Venn diagram showing overlap
between TET1-bound genes, and genes up- or downregulated by both shRNAs
using a cut-off of FDR < 0.05. b, qRT-PCR validation of selected genes.
c, Genes that were found upregulated or downregulated by Tet1 knockdown
show similar regulation in Dnmt TKO ES cells. All error bars denote s.d., n = 3.


also transcriptional activation, and that the majority of TET1- 
mediated transcriptional effects are independent of conversion of

Figure 4 | TET1 interacts with SIN3A. a, Peptides identified by mass
spectrometry from anti-Flag and tandem anti-Flag–HA purification of
Flag-HA-TET1 and Flag-HA-TET2 stably expressed in 293 cells. The presented
proteins are all part of the SIN3A complex’’. b, Antibodies specific for TET1,
SIN3A and c-Myc (negative control) were used for immunoprecipitation (IP)
and western blot (WB) using nuclear extracts from mouse ES cells. Input
represents 8%. c, Examples of SIN3A and TET1 ChIP-seq results in mouse ES
cells. d, Diagram illustrating the overall distribution of SIN3A binding sites into
TSS (±1 kb), promoter (−1 to −5 kb), exon, intron and intergenic regions.
e, Mean distribution of tags across gene bodies for SIN3A and TET1. f, Venn 
diagram illustrating a significant (P< 10 *) overlap between TET1 and SIN3A 
target genes (FDR < 0.01). g, ChIP-qPCR in control or Tet1 knockdown cells 
(shTet1#4 and shTet1#5). h, Left panel, western blot illustrating knockdown 
efficiencies of TET1 and SIN3A. Right panel, genes that are upregulated by Tet1 
knockdown are also de-repressed by Sin3a knockdown. All error bars denote 
s.d., n = 3.

to recruit SIN3A to the GAL4 DNA binding sites in vivo (Supplementary
Fig. 11b-e).

To understand if SIN3A also colocalizes with TET1 on target genes, 
we performed ChIP-seq analysis using two different commercial 
antibodies to SIN3A (Fig. 4c, Supplementary Table 1). This analysis 
showed that SIN3A has a binding profile similar to that of TET1 (Fig. 4d, e
and Supplementary Fig. 4), and that TET1 and SIN3A display a significant
overlap of target genes (Fig. 4f and Supplementary Fig. 12a).
Moreover, ChIP experiments showed that TET1 contributes significantly
to the recruitment of SIN3A (Fig. 4g), whereas depletion of
SIN3A had no or modest effect on TET1 binding to tested target genes
(Supplementary Fig. 12b). To understand if SIN3A is required for the 
silencing of TET1 repressed genes, we performed gene expression 
analysis of Sin3A knockdown cells (Supplementary Fig. 12c and 
Supplementary Table 4). Here we found an extensive overlap between 
genes with increased expression after Tet1 and Sin3A knockdown that 
are also directly bound by both TET1 and SIN3A (Supplementary Fig. 
12d). This implies that SIN3A is required for the repression of a subset 
of TET1 target genes that show increased expression upon TET1 
downregulation (Fig. 4h). 


Discussion 


One of the major findings presented in this paper is that TET1 localizes 
to gene bodies and TSSs of a large number of genes and is particularly 
enriched on genes with high CpG-content. In contrast to the global 
pattern of mC, which is found predominantly in low CpG density 
regions, we found that hmC colocalizes with TET1 at high and inter- 
mediate CpG-content sequences. This finding indicates that TET1 
could have an important role in the metabolism of mC at CpG-rich 
sequences by converting it to hmC. Statistically significant hmC levels 
were not detected around the TSS at the majority of TET1 target genes. 
It is possible that these genes are not methylated and therefore cannot 
be subsequently hydroxymethylated. Alternatively, it is tempting to 
speculate that low and stochastically placed methylations on these 
CpG-rich genes are passively eliminated through replication in rapidly 
dividing ES cells, following TET1-mediated hydroxylation. If so, the 
generated hmC will most likely not be detected by DIP-analyses 
because it will only occur in few cells in the total cell population. In 
this way the role of TET1 would be to remove aberrant stochastic DNA 
methylation and contribute to regulating DNA methylation fidelity in 
ES cells. However, we also found a large number of hmC-positive genes 
and, interestingly, many of these become hypermethylated in differ- 
entiated cells, for example, Dazl, Hormad1, Sycp1 and Sycp2 (ref. 2; 
Supplementary Table 2 and data not shown). This suggests a dual 
biological role of TET1, one in which it removes aberrant DNA methylation
and another that ensures the timely DNA methylation and
silencing of target genes during differentiation. 

We also provide evidence that TET1 has a role in transcriptional 
repression. Interestingly, downregulation of TET1 in Dnmt TKO ES 
cells leads to upregulation of the same genes as observed in wild-type 
ES cells, indicating that the repressive function of TET1 is independ- 
ent of its catalytic activity. We found that TET1 interacts with the 
SIN3A complex and the extensive colocalization of TET1 and the 
SIN3A co-repressor complex at target genes suggests that SIN3A 
has an important function in TET1-mediated gene repression. 

In summary, our results indicate that TET1 is required for the 
timely expression of genes during development. We propose that 
TET1 by converting mC to hmC serves an important function in 
the regulation of DNA methylation fidelity. In turn this conversion 
may lead to a reduction of DNA methylation at CpG-rich gene regu- 
latory sequences. Thus, loss of function of the TET proteins would 
promote the stochastic hypermethylation of promoters leading to 
deregulation of transcription and differentiation. Interestingly, the 
related TET2 oxygenase is frequently mutated in a variety of haema- 
topoietic neoplasms supporting an important role of conversion of 
mC to hmC in cellular homeostasis'*””. 


METHODS SUMMARY 


Cell culture. Low passage (p17) E14TG2a.4 feeder independent ES cells were 
grown on 0.1% gelatin-coated plates in standard ES medium”. Recombinant 
lentiviruses encoding Tet1 or Sin3A shRNA were produced by standard methods. 
Generation of antibodies to murine TET1 and hydroxymethylcytosine. 
Polyclonal antibodies were generated by immunizing rabbits with affinity- 
purified bacterially expressed glutathione-S-transferase (GST) fusion proteins 
GST-Tet1-N (amino acids 1-308), GST-Tet1-C (amino acids 1739-2039) and 
hydroxymethylcytosine coupled to BSA. The antisera were absorbed on GST or 
mC-ovalbumin, respectively, and subsequently affinity-purified on the antigens. 
ChIP/DIP assays and ChIP/DIP-sequencing. Chromatin immunoprecipitation 
assays (ChIP) were performed and analysed as described previously*'. hme/me- 
DIP assays were performed as described”. For ChIP-seq analysis, the DNA 
obtained from the ChIP assays was adaptor-ligated and amplified using a kit 
from Illumina (IP-102-1001). The amplified DNA from hme/me-DIP or ChIP- 
seq experiments was analysed by Solexa/Illumina high-throughput sequencing. 
The tags were mapped to the mouse genome (assembly mm9) with Bowtie or the 
Solexa Analysis Pipeline. Peak detection and binding analysis were performed 
using the CisGenome program” or MEDIPS**. Chromosomal positions (peaks) 
were annotated to the RefSeq database. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS

Cell culture. Low passage (p17) E14TG2a.4 feeder independent ES cells were 
grown on 0.1% gelatin-coated plates in Glasgow medium (Sigma) supplemented 
with glutamine (Gibco), nonessential amino acids (Gibco), sodium pyruvate 
(Gibco), 50 µM β-mercaptoethanol, and 15% fetal bovine serum (HyClone) in 
the presence of leukaemia inhibitory factor (LIF). Recombinant lentiviruses 
encoding Tet1 and Sin3A shRNA were produced by standard methods employing 
co-transfection of pLKO.1 shRNA and packaging vectors in 293FT cells.
shRNA-transduced ES cells were selected 36 h post transduction with 2 µg per ml of
puromycin for 72 h. For Sin3A knockdown, cells were harvested after 48 h to
minimize differentiation. Tet1 shRNAs had the following sequences, shTet1#3:
5'-tgtagaccatcactgttcgac-3', shTet1#4: 5'-tcatctacttctcacctagtg-3', shTet1#5:
5'-agagaacctggtgcatcagat-3', shTet1#A: 5'-gcagatggccgtgacacaaat-3' and shTet1#B:
5'-gctcatggagactaggtttgc-3'. Sin3A shRNA had the following sequence, shSin3A#73:
5'-gctgttccgattgtccttaaa-3'.

Cloning procedures. The open-reading frames (ORF) of mouse Tet1 and Tet2 
were amplified by PCR using cDNA from mouse ES cells or LPS-stimulated 
RAW264.7 mouse macrophages as template, respectively. The amplified frag- 
ments were cloned into the pCR8/GW gateway entry vector (Invitrogen), and the 
DNA sequence was verified by sequencing. Coding errors according to the 
GenBank reference sequences of mouse Tet1 and Tet2 were corrected by
site-directed mutagenesis. To generate expression vectors, the appropriate entry
clones were transferred into gateway-compatible pCDNA5 TO Flag-HA. 
shRNA constructs targeting Tet1 were constructed in pLKO.1. shRNAs targeting 
murine Sin3A were obtained from Sigma-Aldrich. 

Generation of antibodies to mouse TET1 and hydroxymethylcytosine. 
Polyclonal antibodies were generated by immunizing rabbits with affinity-puri- 
fied bacterially expressed GST-Tet1-N (amino acids 1-308) and GST-Tet1-C 
(amino acids 1739-2039). The antibodies were absorbed on GST-coupled cyano- 
gen bromide-activated Sepharose (GE Healthcare) and subsequently affinity 
purified using Sepharose coupled with GST-Tet1-N or GST-Tet1-C. Antibody 
specificity was confirmed by immunoblotting and immunoprecipitation. To 
generate antibodies against hydroxymethylcytosine, 5-hydroxymethylcytidine 
(Berry & Associates), was covalently coupled to BSA essentially as described”* 
and used for immunization of rabbits. Affinity-purified anti-hydroxymethylcy- 
tosine (hmC) antibodies were produced by column absorption of the rabbit 
antisera on methylcytidine-ovalbumin coupled to cyanogen bromide-activated 
Sepharose followed by column-affinity purification on
hydroxymethylcytidine-ovalbumin coupled to Sepharose. The antibodies were eluted with
0.1 M glycine-HCl, neutralised, dialysed against PBS and stored at −80 °C. The
specificity of the purified anti-hmC antibodies was analysed by ELISA and in hme-DIP assays.
For the hme-DIP assays, synthetic 300-base-pair probes incorporating 5, 20 and
100% hmC or mC, respectively, were amplified by PCR using pCR8/GW
(nucleotides 701-1000) as template. The probes (0.001 ng) were spiked into the hmeC/meC
reactions containing 1 µg of sonicated ES DNA. Antibody reactivity with the
probes was detected by qPCR. 

Purification of TET1 and TET2 complexes. To isolate TET1- and TET2-containing
complexes, two-step affinity purification was performed followed by mass
spectrometry analysis. Nuclear extracts (250-500 mg, 3 X 10” cells) from Flp-In-T-REx-293
cell lines expressing Flag-HA-tagged murine TET1 or TET2 were precleared and
incubated with a 700 µl packed volume of anti-Flag beads (anti-Flag M2-agarose,
Sigma) overnight at 4 °C with rotation. The beads were collected by centrifugation at
700g for 5 min and washed six times with 40 resin bed volumes of buffer A (20 mM
Tris-HCl, pH 8.0, 300 mM NaCl, 1.5 mM MgCl2, 0.2 mM EDTA, 10% glycerol,
0.2 mM PMSF, 1 mM DTT, 1 µg ml⁻¹ aprotinin and 1 µg ml⁻¹ leupeptin). The
beads were transferred into a 10-ml poly-prep chromatography column (Bio-Rad)
and complexes were then eluted five times after 10 min of incubation using one resin
bed volume of buffer A supplemented with 0.5 µg µl⁻¹ Flag peptide. The eluate was
subjected to a second round of purification using an antibody against the HA-tag.
The Flag-IP eluate was incubated with 200 µl of a 50% slurry of HA-beads overnight.
The beads were washed four times with buffer A and eluted with 100 µl buffer A
supplemented with 1 µg µl⁻¹ HA peptide for 2 h. The samples were boiled in SDS
loading buffer and run briefly into an SDS-PAGE gel in order to remove the Flag and
HA peptides and other contaminants. A gel slice containing the purified proteins
was isolated for mass spectrometry analysis.

ChIP/DIP assays and ChIP/DIP-seq. Chromatin immunoprecipitation assays 
(ChIP) were performed and analysed as previously described*!. The antibodies 
used were anti-mSin3A (Abcam AB3479, Santa Cruz sc-994X) and the antibodies 
to TET1 described above. ES cell DNA was sonicated to an average size between 
300 and 600 bp. Adaptor-ligated libraries for hmC or mC DNA immunopreci- 
pitations assays (hm-DIP/me-DIP) were constructed using the NEBNext DNA 
Sample Prep Master Mix, NEB combined with Illumina adaptors. hme/me-DIP
assays were performed as described” using 1 µg of denatured sonicated or
adaptor-ligated DNA in 100 µl of binding buffer and 0.1-4 µg of affinity-purified
rabbit hmC antibody or monoclonal mC antibody (Eurogentec BI-MECY-0500).
The samples were incubated for four hours at 4 °C before addition of 10 µl of
anti-rabbit/mouse Dynabeads (Invitrogen). After 2 h of incubation, the samples were
washed four times and bound DNA was eluted by incubation for one hour at
55 °C in 100 µl of 50 mM Tris-HCl, 10 mM EDTA, 0.5% SDS and 20 µg
proteinase K. The DNA was purified using a QIAquick PCR purification kit (Qiagen)
and amplified by 16 cycles of PCR. For the MeCAP (methylated DNA capture by 
affinity purification) experiments, the MethylCap kit (Diagenode) was used 
according to manufacturer’s instructions. For ChIP-seq analysis, the DNA 
obtained from the ChIP assays was adaptor-ligated and amplified using a kit 
from Illumina (IP-102-1001). The amplified DNA from hme/me-DIP or ChIP- 
seq experiments was analysed by Solexa/Illumina high-throughput sequencing. 
After prefiltering the raw data by removing sequenced adapters and low quality 
reads, the tags were mapped to the mouse genome (assembly mm9) with the 
Bowtie alignment tool. To avoid any PCR bias we allowed only one read per 
chromosomal position (unless otherwise specified) thus eliminating spurious 
spikes. Peak detection was performed in the CisGenome program” at an FDR 
cut-off value <0.1 or <0.01 as indicated in the text. IgG was used as control for 
normalization. Venn diagram analysis was performed with Galaxy browser 
(www.galaxy.psu.edu). Most standard peak detection programs are typically
optimized for transcription factor binding site data and anticipate a defined narrow
bell-shaped density profile. However, for epigenetics data, such as mC and hmC, the
peaks tend to be broad and low-intensity, thus requiring a different peak
detection program. We used the MEDIPS tool” (bin size = 50, fragment length = 250,
frame size = 500, step = 250) to detect significant enrichment of signal (reads per 
million, rpm) relative to a control (Dnmt TKO DIP) and an input (IgG DIP) 
sample at an FDR cut-off value <0.1 and a minimal enrichment of ratio >5. For 
the MEDIPS analysis the reads were not limited to one read per chromosomal
position, and the mapped reads were extended in the direction
of the 3'-end to a total length of 250 bases, which was our estimate of the mean
fragment length. Chromosomal positions (peaks) were annotated to the RefSeq 
database (mm9) using the UCSC “refFlat” table”. Genes not uniquely mapped to 
the genome were excluded. Signal vs CpG plot: for the signal vs CpG plots the 
MEDIPS calculated rpm and CpG (CpG values from transformed “coupling” 
factors) values were used. To avoid redundancy only the longest transcript variant 
of each gene was used to define chromosomal locations of promoter, TSS, exons 
and introns. For each bin (non-overlapping) MEDIPS determines the number of 
overlapping reads and the CpG content. For a specified region of interest (ROI), 
for example, an exon, the mean rpm and CpG content of the bins within the range 
was calculated. The distributions of CpG content within the different genomic
categories are distinctly different, with, for example, the TSS region showing the
known bimodal distribution. To depict the rpm as a function of CpG content, the
mean rpm values were stratified according to CpG content (1% resolution) and
the mean of the mean rpms within each stratum was calculated. Due to variability in
the size of ROIs (except for the genome-wide analysis), the plots for the different 
genomic categories are not directly comparable. Wiggle-based plots: to avoid 
redundancy, the longest transcript variant of each gene in the RefSeq database 
was used as reference. In total the chromosomal mappings of 21,513 unique genes 
were used. The filtered alignment files were converted to bigWig files from which 
the tag count information was extracted using unix tools from the UCSC website. 
Gene body plots: 40 non-overlapping windows with average tag number per base
were calculated for each gene. The 10 kb upstream of the TSS and the 10 kb downstream of
the transcription end site (TES) were divided into windows of 1 kb. Between
the TSS and TES, each gene was divided into 20 windows of equal (gene-specific) size
and the average count was calculated. All statistics and plotting were done using
the statistical program R.
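
As a rough illustration of the read filtering and gene-body windowing described above, the following Python sketch keeps one read per chromosomal position and builds the 40 windows per gene (ten 1-kb windows upstream of the TSS, 20 equal-sized windows across the gene body and ten 1-kb windows downstream of the TES). The function names, the tuple layout of a read and the plus-strand orientation are assumptions made for this example; the analysis itself was carried out with UCSC tools and R, not with this code.

```python
def dedup_positions(reads):
    """Keep the first read seen at each position.

    Each read is assumed to be a hashable tuple such as
    (chromosome, start, strand); this layout is hypothetical.
    """
    seen = set()
    kept = []
    for read in reads:
        if read not in seen:
            seen.add(read)
            kept.append(read)
    return kept


def gene_body_windows(tss, tes, flank=10_000, flank_win=1_000, body_bins=20):
    """Return (start, end) windows for a plus-strand gene (assumption).

    With the defaults this yields 10 + 20 + 10 = 40 windows.
    """
    if tes <= tss:
        raise ValueError("expected tes > tss for a plus-strand gene")
    n_flank = flank // flank_win
    windows = []
    # Upstream flank: fixed-size windows ending at the TSS.
    for k in range(n_flank):
        start = tss - flank + k * flank_win
        windows.append((start, start + flank_win))
    # Gene body: equal, gene-specific window size.
    body_len = tes - tss
    for k in range(body_bins):
        windows.append((tss + body_len * k / body_bins,
                        tss + body_len * (k + 1) / body_bins))
    # Downstream flank: fixed-size windows starting at the TES.
    for k in range(n_flank):
        start = tes + k * flank_win
        windows.append((start, start + flank_win))
    return windows


def mean_tags_per_base(windows, tag_positions):
    """Average tag count per base in each window."""
    profile = []
    for start, end in windows:
        n_tags = sum(1 for pos in tag_positions if start <= pos < end)
        profile.append(n_tags / (end - start))
    return profile
```

For a gene with its TSS at 100,000 and TES at 120,000, `gene_body_windows(100_000, 120_000)` returns 40 windows, the first spanning 90,000 to 91,000. Dividing by window length keeps the flanks and gene bodies of different lengths on a common per-base scale.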

mRNA expression analysis. For expression analysis, total RNA was purified 
from murine embryonic stem cells using RNeasy (Qiagen). The RNA was reverse 
transcribed using TaqMan reverse transcription reagents from ABI, according to 
the manufacturer’s instructions. For RNA quantification, reverse-transcribed
total RNA was analysed by real-time PCR using SYBR Green PCR Master Mix
(Fermentas) and an ABI Prism 7700 Sequence Detection system. All reactions
were analysed in triplicate. Primer sequences are listed in Supplementary Figs
13 and 14. For microarray analysis, RNA was extracted with
the RNeasy Plus RNA extraction kit (Qiagen). RNA was hybridized on mouse 
Gene 1.0 ST arrays by the RH Microarray Center at Rigshospitalet, Copenhagen, 
following Affymetrix procedures and analysis. Gene expression analyses of RNA 
from shScr, shTet1#4, shTet1#5 and shSin3A#73 cells were performed in
triplicate, and in the subsequent data analysis an FDR cut-off of <0.05 was used.

Unbound or distant planetary mass population
detected by gravitational microlensing 


The Microlensing Observations in Astrophysics (MOA) Collaboration & The Optical Gravitational Lensing Experiment (OGLE) 


Collaboration* 


Since 1995, more than 500 exoplanets have been detected using 
different techniques'’”, of which 12 were detected with gravita- 
tional microlensing**. Most of these are gravitationally bound to 
their host stars. There is some evidence of free-floating planetary- 
mass objects in young star-forming regions” *, but these objects are 
limited to massive objects of 3 to 15 Jupiter masses with large 
uncertainties in photometric mass estimates and their abundance. 
Here, we report the discovery of a population of unbound or distant
Jupiter-mass objects, which are almost twice ($1.8^{+1.7}_{-0.8}$) as
common as main-sequence stars, based on two years of gravitational
microlensing survey observations towards the Galactic
Bulge. These planetary-mass objects have no host stars that can 
be detected within about ten astronomical units by gravitational 
microlensing. However, a comparison with constraints from direct 
imaging’ suggests that most of these planetary-mass objects are not 
bound to any host star. An abrupt change in the mass function at 
about one Jupiter mass favours the idea that their formation pro- 
cess is different from that of stars and brown dwarfs. They may 
have formed in proto-planetary disks and subsequently scattered 
into unbound or very distant orbits. 

In a gravitational microlensing event, a foreground lens object is 
detected as a result of the characteristic magnification of a background 
source star as it passes behind the gravitational field of the lens'®. The 
lens object is detected by means of its mass and not its luminosity. The 
duration of the magnification is parameterized by the Einstein radius
crossing time, $t_{\rm E} \sim \sqrt{M/M_{\rm J}}$ days, where $M_{\rm J} = 9.5 \times 10^{-4} M_\odot$ is
Jupiter’s mass ($M_\odot$ is the mass of the Sun). Thus, microlensing can
detect faint planetary-mass objects, which are either unbound to any
host star'!”” or are in very wide orbits'’, as short-timescale events
with $t_{\rm E} < 2$ days. Although $t_{\rm E}$ also depends on the distance and transverse
velocity of the lens (see Supplementary Information), the
observed $t_{\rm E}$ distribution can be used as a statistical probe of the mass
function of the lens objects because the spatial and velocity distribu- 
tions in the Galactic disk and bulge are reasonably well known. 
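
The square-root scaling of the crossing time with lens mass can be made concrete with a short Python sketch. It uses only the order-of-magnitude relation quoted above and ignores the dependence on lens distance and transverse velocity; the function and constant names are chosen for this example.

```python
import math

# Jupiter's mass in solar masses, as quoted in the text.
M_JUPITER_IN_MSUN = 9.5e-4


def einstein_crossing_time_days(mass_msun):
    """Order-of-magnitude t_E ~ sqrt(M / M_J) in days.

    Distance and transverse velocity of the lens are ignored, so this is
    only a typical value, not a prediction for an individual event.
    """
    return math.sqrt(mass_msun / M_JUPITER_IN_MSUN)


if __name__ == "__main__":
    for label, mass in [("Jupiter-mass lens", 9.5e-4),
                        ("brown dwarf, 0.05 solar masses", 0.05),
                        ("solar-mass star", 1.0)]:
        print(f"{label}: t_E ~ {einstein_crossing_time_days(mass):.1f} days")
```

Under this scaling a Jupiter-mass lens gives a crossing time of about one day and a solar-mass lens about a month, which is why events shorter than two days point to planetary-mass lenses.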

The Microlensing Observations in Astrophysics (MOA) and 
Optical Gravitational Lensing Experiment (OGLE)"* groups both con- 
duct microlensing surveys towards the Galactic Bulge. The second 
phase of MOA, called MOA-II, carries out a high-cadence photometric
survey of 50 million stars in bulge fields with a cadence of 10-50 min.
This strategy enables MOA to detect very short events with $t_{\rm E} < 2$ days,
which were quite rare in previous microlensing surveys that had lower 
cadences'?!41617, 

In this analysis of the 2006-2007 MOA-II data set, light curves of 
genuine microlensing events were distinguished from intrinsic vari- 
ables and artefacts by several empirical criteria, which have been 
developed in previous microlensing surveys'*””. The light curves must 
have a single brightening episode of more than three consecutive sig- 
nificant data points with a constant baseline, and should be well fitted 
by a theoretical microlensing model? with a well-constrained $t_{\rm E}$ (see the 
Supplementary Information). 


Although there are a thousand microlensing events in this sample, 
only 474 well characterized events passed our strict selection criteria. 
Ten of these events have $t_{\rm E} < 2$ days (see Fig. 1 and Table 1), thus
indicating planetary-mass lenses. We have confirmed that this event
sample has no significant contamination by possible background 
effects including: (1) cosmic-ray hits, (2) fast-moving objects, (3) cata- 
clysmic variables, (4) background supernovae, (5) binary microlensing 
events and (6) microlensing by high-velocity stars and Galactic-halo 
stellar remnants. For example, effect (1) is excluded because cosmic 
rays never hit the same place in four consecutive images, microlensing 
model fits for effects (2) to (5) produce a high $\chi^2$ and unphysical values 
of parameters, and effect (6) is excluded by proper-motion and radial- 
velocity observations (see Supplementary Information). After the 
MOA event selection was complete, the MOA group requested addi- 
tional independent light-curve data of these short events from the 
OGLE group. Seven of the ten events with $t_{\rm E} < 2$ days were also 
observed by OGLE-III”*, and none of them have any other brightening 
in the eight-year OGLE-III light curves. For six of these seven events, 
there are OGLE data obtained during the lensing event that confirm 
the predictions of the MOA microlensing models. Thus, the OGLE 
data confirm the microlensing interpretation of these short events. 

The detection efficiencies for this analysis were estimated with a 
Monte Carlo simulation’*’”. We simulated 20 million artificial events 
to evaluate the detection efficiency as a function of $t_{\rm E}$, yielding
$\epsilon(t_{\rm E}) \approx$ 1%, 3%, 5%, 10%, 15% and 10% at $t_{\rm E}$ = 0.3, 1, 2, 10, 30 and
100 days, respectively. The details of the efficiency calculations and
consistency tests of the selected event distribution are discussed in 
the Supplementary Information. 
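
For readers who want a usable efficiency curve from the six quoted values, a minimal Python sketch is given below. Interpolating linearly in the logarithm of the crossing time and holding the end values constant outside the quoted range are assumptions of this example, not the procedure used in the analysis, which relied on the full Monte Carlo simulation.

```python
import math

# Efficiency values quoted in the text (t_E in days, efficiency as a fraction).
T_E_GRID = [0.3, 1.0, 2.0, 10.0, 30.0, 100.0]
EFF_GRID = [0.01, 0.03, 0.05, 0.10, 0.15, 0.10]


def detection_efficiency(t_e_days):
    """Interpolate the quoted efficiencies linearly in log10(t_E).

    Outside the quoted range the nearest end value is returned
    (an assumption made for this sketch).
    """
    if t_e_days <= T_E_GRID[0]:
        return EFF_GRID[0]
    if t_e_days >= T_E_GRID[-1]:
        return EFF_GRID[-1]
    for k in range(len(T_E_GRID) - 1):
        t_lo, t_hi = T_E_GRID[k], T_E_GRID[k + 1]
        if t_lo <= t_e_days <= t_hi:
            frac = ((math.log10(t_e_days) - math.log10(t_lo))
                    / (math.log10(t_hi) - math.log10(t_lo)))
            return EFF_GRID[k] + frac * (EFF_GRID[k + 1] - EFF_GRID[k])
    raise ValueError("t_e_days must be a finite positive number")
```

The curve peaks near 30 days and falls off steeply for the shortest events, so each detected event shorter than two days stands for many more that were missed.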

The observed $t_{\rm E}$ distribution is compared to two mass function models
in Fig. 2. A model $t_{\rm E}$ distribution, $\Phi(t_{\rm E})$, can be calculated for an
assumed mass function with a standard Galactic mass density and
velocity model'*!””°. We consider two mass functions. The first is a
broken power law*°”! $dN/dM = M^{-\alpha}$, with power indices of $\alpha_1 = 2.0$
for $0.7 \le M/M_\odot \le 1$, $\alpha_2 = 1.3$ for $0.08 \le M/M_\odot \le 0.7$ and $\alpha_3$ as a
fitting parameter for the brown dwarf regime $0.01 \le M/M_\odot \le 0.08$. The
second is a log-normal function” $dN/d\log M = \exp[-(\log M - \log M_{\rm c})^2/(2\sigma_{\rm c}^2)]$
with a mean mass $M_{\rm c}$ and a width in $\log M$ of $\sigma_{\rm c}$, for
$0 \le M/M_\odot \le 1.0$. For both mass functions, we assume that stars that
were initially above $1 M_\odot$ have evolved into stellar remnants: white
dwarfs, neutron stars or black holes, depending on their initial masses**
(see Fig. 2 and Supplementary Table 3).
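
The two mass functions described above can be sketched in Python as follows. Both are left unnormalised. Matching the power-law segments so that the function is continuous at the break masses, using base-10 logarithms and placing a negative sign in the log-normal exponent are assumptions of this sketch; the function and parameter names are hypothetical, and the stellar-remnant component is omitted.

```python
import math


def broken_power_law(m, alpha3, alpha1=2.0, alpha2=1.3):
    """Unnormalised dN/dM for mass m in solar masses.

    alpha1 applies for 0.7 <= m <= 1, alpha2 for 0.08 <= m < 0.7 and
    alpha3 (the fitted index) for 0.01 <= m < 0.08. Segments are scaled
    to join continuously at the breaks (assumption).
    """
    if 0.7 <= m <= 1.0:
        return m ** -alpha1
    if 0.08 <= m < 0.7:
        return 0.7 ** (alpha2 - alpha1) * m ** -alpha2
    if 0.01 <= m < 0.08:
        return (0.7 ** (alpha2 - alpha1)
                * 0.08 ** (alpha3 - alpha2) * m ** -alpha3)
    return 0.0


def log_normal(m, m_c, sigma_c):
    """Unnormalised dN/dlogM with mean mass m_c and width sigma_c in log10(m)."""
    if not 0.0 < m <= 1.0:
        return 0.0
    delta = math.log10(m) - math.log10(m_c)
    return math.exp(-delta ** 2 / (2.0 * sigma_c ** 2))
```

In a fit, `alpha3`, or `m_c` and `sigma_c`, would be the free parameters adjusted to match the observed distribution of crossing times.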

The mass functions were constrained by a likelihood analysis, with
the likelihood function given by the product of the model probability
$\Phi(t_{\rm E})$ of finding $N_{\rm obs} = 474$ events with each of the observed $t_{{\rm E},i}$, that
is: $L = \prod_{i=1}^{N_{\rm obs}} \Phi(t_{{\rm E},i})\,\epsilon(t_{{\rm E},i})$.
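
A likelihood of this product form is usually evaluated as a sum of logarithms to avoid numerical underflow with hundreds of events. The Python sketch below shows that step only; the callables `phi` and `eff` stand in for the model crossing-time distribution and the detection efficiency, and both their names and their toy definitions in the usage lines are assumptions of this example.

```python
import math


def log_likelihood(t_e_observed, phi, eff):
    """Return ln L = sum_i ln[phi(t_E,i) * eff(t_E,i)].

    phi and eff are callables taking a crossing time in days. An event with
    zero model probability makes the likelihood vanish, so -inf is returned.
    """
    total = 0.0
    for t_e in t_e_observed:
        p = phi(t_e) * eff(t_e)
        if p <= 0.0:
            return float("-inf")
        total += math.log(p)
    return total


if __name__ == "__main__":
    # Toy inputs for illustration only; not values from the survey.
    toy_phi = lambda t_e: math.exp(-t_e / 20.0) / 20.0
    toy_eff = lambda t_e: 0.1
    print(log_likelihood([1.2, 8.0, 25.0], toy_phi, toy_eff))
```

Comparing this quantity between candidate mass functions, with the same observed crossing times, is what constrains the model parameters.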

We evaluated the likelihood distributions for these mass functions 
both with and without the $t_{\rm E} < 2$ day events, but the inclusion of the events
with $t_{\rm E} < 2$ days makes little difference. The results are shown in
Supplementary Table 3 and Supplementary Figs 6 and 7. Figure 2
indicates that both models match the data well for $t_{\rm E} \ge 2$ days, but at


*Lists of participants and their affiliations appear at the end of the paper.

Figure 1 | Light curves of event MOA-ip-3 and event MOA-ip-10. These
have the highest signal-to-noise ratio of the ten microlensing events with
$t_{\rm E} < 2$ days (see Supplementary Fig. 1 for the others). MOA data are in black
and OGLE data are in red, with error bars indicating the s.e.m. a, MOA-ip-3
light curve; b, MOA-ip-10 light curve. The green lines represent the best-fit
microlensing model light curves. For each event, the upper panel shows the full
two-year light curve, the middle panel is a close-up of the light-curve peak, and
the bottom panel shows the residuals from the best-fit model in units of the
magnification, $\Delta A$. $u_0$ indicates the source-lens impact parameter in units of
the Einstein radius. The second phase of MOA, MOA-II, carried out a
very-high-cadence photometric survey of 50 million stars in 22 bulge fields (of


$t_{\rm E} < 2$ days, the ten observed events are well above the model
predictions. The power-law and log-normal models predict 1.5 and 2.5
events with $t_{\rm E} < 2$ days, respectively, and the corresponding Poisson
probabilities of observing ten or more such events are $4 \times 10^{-6}$ and $3 \times 10^{-4}$.
Thus, we feel confident in adding a new planetary-mass population.
For simplicity, we chose a planetary-mass function model with a
$\delta$-function in mass $M_{\rm PL}$, and a fraction of all objects in the planetary-mass
population $\Phi_{\rm PL}$. The values of $(M_{\rm PL}/M_\odot, \Phi_{\rm PL})$ derived from the likelihood
analysis are $(1.1^{+1.2}_{-0.6} \times 10^{-3}, 0.49^{+0.13}_{-0.13})$ and $(0.83^{+0.96}_{-0.51} \times 10^{-3}, 0.46^{+0.17}_{-0.15})$
for the power-law and log-normal models, respectively. The contours 
are shown in Fig. 3. Both models for $\Phi(t_{\rm E})$ provide good fits to the entire
$t_{\rm E}$ distribution, as shown in Fig. 2. The power-law and log-normal
models imply $1.9^{+1.3}_{-0.8}$ and $1.8^{+1.7}_{-0.8}$ times as many unbound or distant
Jupiter-mass objects as main-sequence stars in the mass range 
$0.08 < M/M_\odot < 1.0$, respectively. These planetary-mass objects are at
least 1.5 times as frequent as planets with host stars (see Supplementary 
Information). We tested a third mass function that has fewer massive 


Table 1 | Microlensing parameters of short-timescale events 


2.2 deg² each) with a 1.8-m telescope at Mt John Observatory in New Zealand. 
MOA detects 500-600 microlensing events during eight months observation 
every year. In 2006-2007, MOA observed two central bulge fields every 10 min, 
and other bulge fields with a 50 min cadence, which resulted in about 8,250 and 
1,660-2,980 images, respectively. This strategy enabled MOA to detect very 
short events with $t_{\rm E} < 2$ days. Since 2002, the OGLE-III survey has monitored 
the bulge with the 1.3-m Warsaw telescope at Las Campanas Observatory, 
Chile, with a smaller field-of-view but better astronomical seeing than MOA. 
The OGLE-III observing cadence was 1-2 observations per night, but the 
OGLE photometry is usually more precise and fills gaps in the MOA light 
curves owing to the difference in longitude. 


stars and brown dwarfs”, and found that the resultant planetary-mass 
function parameters are consistent with the above values (see the 
Supplementary Information). 
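
The Poisson probabilities quoted for the short-event excess can be checked with a few lines of Python. The sketch assumes that the quoted probability is the upper tail, that is, the chance of seeing ten or more events when the model predicts 1.5 or 2.5; the function name is chosen for this example.

```python
import math


def poisson_tail(n_observed, expected):
    """Probability of n_observed or more events for a Poisson mean `expected`."""
    below = sum(math.exp(-expected) * expected ** k / math.factorial(k)
                for k in range(n_observed))
    return 1.0 - below


if __name__ == "__main__":
    for model, expected in [("power law", 1.5), ("log-normal", 2.5)]:
        print(f"{model}: P(N >= 10 | {expected}) = {poisson_tail(10, expected):.1e}")
```

Under this tail interpretation the two probabilities come out at roughly 4 × 10⁻⁶ and 3 × 10⁻⁴, so neither stellar and brown dwarf mass function alone plausibly accounts for ten short events.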

The lenses for these short events could be either free-floating planets or
planets with wide separations of more than about ten astronomical units
(AU) from their host stars, for which we cannot detect the host star in the
light curves. However, direct imaging, with adaptive optics, of planets
orbiting young stars places upper limits on planets at wide separations.
The Gemini Planet Imager has set upper limits on the number of stars
with Jupiter-mass planets at semi-major axes of 10-500 AU. From these
results, we estimate that <0.4 of the 1.8 planetary-mass objects per star
are likely to be bound to stars at orbital separations of <500 AU (see
Supplementary Information section 8). Hence, more than 75% of these
planetary-mass objects are probably not bound to stars if their typical mass
is a Jupiter mass or more.
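
The "more than 75%" figure follows directly from the two numbers quoted above. A minimal sketch of that arithmetic, using only the values in the text (the variable names are illustrative, not the authors'):

```python
# Values quoted in the text: about 1.8 planetary-mass objects per star,
# of which fewer than 0.4 per star can be bound within 500 AU.
objects_per_star = 1.8
max_bound_per_star = 0.4

# Lower limit on the fraction that is unbound (or bound only at >500 AU).
min_unbound_fraction = 1.0 - max_bound_per_star / objects_per_star

print(f"unbound fraction > {min_unbound_fraction:.2f}")
```

The lower limit comes out slightly above three-quarters, consistent with the statement in the text.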

Because the δ-function planetary models are not likely to be realistic,
we also tested a fourth mass function that is identical to the first, 


ID         Field   RA, α (2000)      Dec., δ (2000)     N_tE  t_0 (JD′)    t_E (days)    u_0 (R_E)       A_max  I_s (mag)  d_min (R_E*)
MOA-ip-1   gb1-4   17h46min24.506s   −34° 30′ 36.82″    9     3883.24171   0.73 ± 0.08   0.028 ± 0.003   35.6   17         7.0
MOA-ip-2   gb4-3   17h52min34.143s   −30° 54′ 14.25″    28    4223.88851   0.49 ± 0.10   0.400 ± 0.212   2.6    17.9       3.3
MOA-ip-3   gb5-7   17h54min58.325s   −29° 38′ 20.68″    170   4295.34720   1.88 ± 0.12   0.911 ± 0.096   1.4    17.2       3.6
MOA-ip-4   gb5-8   17h54min24.543s   −29° 13′ 29.39″    81    3961.38803   1.48 ± 0.12   0.271 ± 0.061   3.8    19.2       5.1
MOA-ip-5   gb9-2   17h57min17.008s   −29° 02′ 33.59″    69    4169.60907   1.62 ± 0.69   0.126 ± 0.159   8.0    19.2       2.4
MOA-ip-6   gb9-4   17h59min19.977s   −29° 31′ 24.70″    27    4189.49214   1.78 ± 0.24   0.499 ± 0.122   2.2    18.3       4.8
MOA-ip-7   gb9-5   17h57min36.678s   −29° 59′ 40.52″    51    4370.69496   1.82 ± 0.87   0.143 ± 0.125   7.0    19.4       5.2
MOA-ip-8   gb9-5   17h59min34.877s   −30° 04′ 24.04″    47    4013.14052   1.36 ± 0.15   0.103 ± 0.016   9.8    18.8       4.8
MOA-ip-9   gb10-5  17h57min52.952s   −28° 16′ 56.66″    16    3910.81772   0.96 ± 0.21   0.163 ± 0.058   6.2    19.5       3.4
MOA-ip-10  gb11-9  18h09min00.076s   −32° 18′ 39.91″    21    3932.99205   1.19 ± 0.04   0.032 ± 0.001   30.8   18.8       15.0

N_tE indicates the number of data points within t_0 ± t_E, and t_0, t_E, A_max and I_s indicate the time of peak magnification, the Einstein radius crossing time, the maximum magnification, and the source star magnitude of
the best-fit models of the MOA data, respectively. JD′ = JD − 2,450,000. u_0 and d_min indicate the source-lens impact parameter and minimum host star separation in units of the Einstein radii of the planetary-mass
lens, R_E, and possible host star, R_E*, respectively. The errors in t_E and u_0 represent 1σ limits; d_min indicates 2σ limits. MOA-ip-2, MOA-ip-3 and MOA-ip-10 were alerted as MOA-2007-BLG-144, MOA-2007-BLG-309
and MOA-2006-BLG-098 by the MOA real-time alert system (http://www.massey.ac.nz/~iabond/moa).

Figure 2 | Observed and theoretical distributions of the event timescale, t_E.
The black histogram represents the number N of the 474 observed microlensing
events in each bin with error bars indicating the s.e.m. The red and blue lines
indicate the best-fit models with the power-law and log-normal mass functions,
respectively. For both mass functions, we assume that stars that were initially
above 1 M⊙ have evolved into stellar remnants (white dwarfs, neutron stars or
black holes, depending on their initial masses). The number of remnants is
determined by extending the upper main-sequence power law α = 2.0 to
100 M⊙, and the final remnant mass distributions are given by Gaussians (see
Supplementary Table 3). Each model is multiplied by the detection efficiencies. In
each model, dashed lines indicate models for stellar, stellar remnant and brown
dwarf populations, and the dotted lines represent the planetary-mass population.
Solid lines are the sums of these populations, and both models fit the data well.

Figure 3 | Likelihood contours for the planetary-mass function parameters.
Φ_PL indicates the fraction of all objects in the planetary-mass population, not
including the brown dwarfs that have planetary mass in the tail of the log-normal
mass function. M_PL represents their masses. The two sets of contours indicate the
68% and 95% confidence levels. The red and blue curves indicate the power-law
and log-normal mass functions, respectively, and crosses indicate the
maximum-likelihood points. The top-axis scale is in Jupiter masses and the bottom-axis scale
is in solar masses. For the power-law model, the likelihoods are evaluated in the
(α_3, M_PL, Φ_PL) space and projected into the (M_PL, Φ_PL) plane. The M_c = 0.12 and
σ_c = 0.76 parameters are fixed for the log-normal model. The median and 68%
confidence intervals of (Mp,/Mo, py) are (1.1 ne x 10~°,0.49* 913) and 
(0.83 70-26 x 10~?,0.46 +17) for the power-law and log-normal models, 
respectively. The results for two models are consistent with each other. The 
power-law and log-normal models imply 1.9*};3 and 1.8%} times as many 
unbound or distant Jupiter-mass objects as the main-sequence stars. α_3 is
consistent with the values derived without planetary population, indicating that
brown dwarfs are 0.7 ± 0.3 times as common as main-sequence stars. The
numerical values of the models are summarized in Supplementary Table 3. 
broken-power-law model except for having a power-law form in the
planetary-mass regime M < 0.01 M⊙. This yields a planetary-mass
index of &py, = ee which is much steeper than the brown dwarf 
slope of %3 = 0.491)55, indicating that they are distinct populations 
(see Supplementary Information). 

Planet-formation theories predict that dynamical instabilities in
planetary systems with multiple giant planets could scatter many of
these planets into unbound orbits, as well as some into large separations.
Recent observations also indicate that planet-planet scattering
plays an important part in moving giant planets into short-period
orbits. The planetary-mass population that we have identified here
may have formed in protoplanetary disks at much smaller separations
and then been scattered into unbound or very distant orbits.

Observation of the antimatter helium-4 nucleus


The STAR Collaboration* 


High-energy nuclear collisions create an energy density similar to
that of the Universe microseconds after the Big Bang; in both
cases, matter and antimatter are formed with comparable abundance.
However, the relatively short-lived expansion in nuclear collisions
allows antimatter to decouple quickly from matter, and avoid
annihilation. Thus, a high-energy accelerator of heavy nuclei provides
an efficient means of producing and studying antimatter. The
antimatter helium-4 nucleus ($^4\overline{\mathrm{He}}$), also known as the anti-α ($\bar{\alpha}$),
consists of two antiprotons and two antineutrons (baryon number
B = −4). It has not been observed previously, although the
α-particle was identified a century ago by Rutherford and is present
in cosmic radiation at the ten per cent level. Antimatter nuclei with
B < −1 have been observed only as rare products of interactions at
particle accelerators, where the rate of antinucleus production in
high-energy collisions decreases by a factor of about 1,000 with each
additional antinucleon. Here we report the observation of $^4\overline{\mathrm{He}}$,
the heaviest observed antinucleus to date. In total, 18 $^4\overline{\mathrm{He}}$ counts
were detected at the STAR experiment at the Relativistic Heavy Ion
Collider (RHIC; ref. 6) in $10^{9}$ recorded gold-on-gold (Au+Au) collisions
at centre-of-mass energies of 200 GeV and 62 GeV per
nucleon-nucleon pair. The yield is consistent with expectations from
thermodynamic and coalescent nucleosynthesis models, providing
an indication of the production rate of even heavier antimatter
nuclei and a benchmark for possible future observations of $^4\overline{\mathrm{He}}$ in
cosmic radiation.

In 1928, the existence of negative energy states of electrons was
predicted on the basis of the application of symmetry principles to
quantum mechanics, but these states were only recognised to be
antimatter after the discovery of the positron (the antielectron) in
cosmic radiation four years later. The predicted antiprotons and
antineutrons were observed in 1955, followed by antideuterons ($\bar{d}$),
antitritons ($^3\overline{\mathrm{H}}$), and antihelium-3 ($^3\overline{\mathrm{He}}$) during the following two
decades. Recent accelerator and detector advances led to the first
production of antihydrogen atoms in 1995 and the discovery of
strange antimatter, the antihypertriton ($^3_{\bar{\Lambda}}\overline{\mathrm{H}}$), in 2010 at RHIC at the
Brookhaven National Laboratory (ref. 18 and references therein).

Collisions of relativistic heavy nuclei create suitable conditions for
producing antinuclei, because large amounts of energy are deposited
into a more extended volume than that achieved in elementary particle
collisions. These nuclear interactions briefly ($\sim 10^{-23}$ s) produce hot
and dense matter containing roughly equal numbers of quarks and
antiquarks, often interpreted as quark-gluon plasma. In contrast to
the Big Bang, nuclear collisions produce negligible gravitational attraction
and allow the plasma to expand rapidly. The hot and dense matter
cools down and undergoes a transition into a hadron gas, producing
nucleons and their antiparticles. The production of light antinuclei can
be modelled successfully by macroscopic thermodynamics, which
assumes energy equipartition, or by a microscopic coalescence process,
which assumes uncorrelated probabilities for antinucleons close
in position and momentum to become bound. The high temperature
and high antibaryon density of relativistic heavy-ion collisions provide a
favourable environment for both production mechanisms.

The central detector used in our measurements of antimatter, the
Time Projection Chamber (TPC) of the STAR experiment (Solenoidal
Tracker At RHIC), is situated in a solenoidal magnetic field and is used
for three-dimensional imaging of the ionization trail left along the
path of charged particles (Fig. 1). In addition to the momentum provided
by the track curvature in the magnetic field, the detection of $^4\overline{\mathrm{He}}$
particles relies on two key measurements: the mean energy loss per
unit track length, $\langle dE/dx\rangle$, in the TPC gas, which helps distinguish
particles with different masses or charges, and the time of flight of
particles arriving at the time of flight barrel (TOF) surrounding the
TPC. In general, time of flight provides particle identification in a
higher momentum range than $\langle dE/dx\rangle$. The $\langle dE/dx\rangle$ resolution is
7.5% and the timing resolution for the TOF system is 95 ps within a
7-75 ns window.

The trigger system at STAR selects collisions of interest for analysis.
The minimum-bias trigger selects all particle-producing collisions,
regardless of the extent of overlap of the incident nuclei. A central
trigger (CENT) preferentially selects head-on collisions, rejecting
about 90% of the events acquired using the minimum-bias trigger.
The sample of $10^{9}$ Au+Au collisions used in this search is selected
on the basis of the minimum-bias trigger, on CENT, and on various
specialized triggers. Preferential selection of events containing tracks
with charge Ze = ±2e (where e is the electron charge and Z is the
particle charge in units of e) was implemented using a High-Level
Trigger (HLT) for data acquired in 2010. The HLT used computational
resources at STAR to perform a real-time fast track reconstruction to
tag events that had at least one track with a $\langle dE/dx\rangle$ value that is larger
than a threshold set to three standard deviations below the theoretically
expected value for $^3\mathrm{He}$ at the same magnetic rigidity. The HLT
successfully identified 70% of the events where a *He track was present 
while selecting only 0.4% of the events for express analyses. 

Figure 2 shows $\langle dE/dx\rangle$ versus the magnitude of magnetic rigidity,
p/|Z|, where p is momentum. A distinct band of positive particles




Figure 1 | A three-dimensional rendering of the STAR TPC surrounded by
the TOF barrel shown as the outermost cylinder. Tracks from an event which
contains a $^4\overline{\mathrm{He}}$ are shown, with the $^4\overline{\mathrm{He}}$ track highlighted in bold red.


*Lists of participants and their affiliations appear at the end of the paper. 
Figure 2 | $\langle dE/dx\rangle$ versus p/|Z|. a, For negatively charged particles (grey and
blue dots); b, for positively charged particles (grey and orange dots). The black
curves show the expected values for each species. The lower edges of the bands
of coloured dots correspond to the online calculation by the HLT of 3σ below


centred around the expected value for $^3\mathrm{He}$ particles is shown in
Fig. 2b and indicates that the detector is well calibrated. In Fig. 2a,
where p/|Z| is less than 1.4 GeV/c (where c is the velocity of light), four
negative particles are particularly well separated from the $^3\overline{\mathrm{He}}$ band
and are located within the expected band for $^4\overline{\mathrm{He}}$. Above 1.75 GeV/c,
$\langle dE/dx\rangle$ values of $^3\overline{\mathrm{He}}$ and $^4\overline{\mathrm{He}}$ merge and the TOF system is needed to
separate these two species.

Figure 3a and b shows the $\langle dE/dx\rangle$ (in units of multiples of $\sigma_{dE/dx}$,
$n_{\sigma_{dE/dx}}$) versus calculated mass $m = (p/c)\sqrt{t^{2}c^{2}/L^{2} - 1}$, where $\sigma_{dE/dx}$
is the r.m.s. width of the $\langle dE/dx\rangle$ distribution for $^4\mathrm{He}$ or $^4\overline{\mathrm{He}}$, and t and L
are the time of flight and path length, respectively. Negatively and positively
charged particles are shown in Fig. 3a and b, respectively. In both
panels, the majority species are $^3\overline{\mathrm{He}}$ and $^3\mathrm{He}$. In Fig. 3b, the $^4\mathrm{He}$ particles
cluster around $n_{\sigma_{dE/dx}} = 0$ and mass 3.73 GeV/c², the appropriate mass
for $^4\mathrm{He}$. A similar but smaller cluster of particles can be found in Fig. 3a
for $^4\overline{\mathrm{He}}$. In Fig. 3c we show the projection onto the mass axis for particles
in Fig. 3a and b with $n_{\sigma_{dE/dx}}$ of −2 to 3. There is clear separation between
the $^3\overline{\mathrm{He}}$ and $^4\overline{\mathrm{He}}$ mass peaks. Eighteen counts for $^4\overline{\mathrm{He}}$ are observed. Of
those, sixteen are from collisions recorded in 2010. Two counts identified
by $\langle dE/dx\rangle$ alone from data recorded in 2007 are not included in
this figure, because the STAR TOF was not installed at that time.
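
The time-of-flight mass relation in this paragraph, m = (p/c)·sqrt(c²t²/L² − 1), can be sketched numerically. The example below is illustrative only: the momentum and path length are hypothetical values chosen for the demonstration, not STAR detector parameters, and the two masses are the 2.81 and 3.73 GeV/c² values quoted for the mass-3 and mass-4 helium isotopes.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond


def mass_from_tof(p_gev_c, t_ns, path_m):
    """Mass in GeV/c^2 from momentum (GeV/c), flight time (ns) and path length (m)."""
    inv_beta = C_M_PER_NS * t_ns / path_m  # ct/L = 1/beta
    if inv_beta <= 1.0:
        raise ValueError("flight time implies a speed at or above c")
    return p_gev_c * math.sqrt(inv_beta ** 2 - 1.0)


def tof_for_mass(p_gev_c, m_gev, path_m):
    """Flight time in ns for a particle of given momentum and mass (inverse relation)."""
    beta = p_gev_c / math.hypot(p_gev_c, m_gev)  # beta = p / E
    return path_m / (beta * C_M_PER_NS)


# Hypothetical track: total momentum 4 GeV/c over a 2.2 m path (illustrative only).
p, path = 4.0, 2.2
t_mass3 = tof_for_mass(p, 2.81, path)
t_mass4 = tof_for_mass(p, 3.73, path)

print(f"flight-time difference: {t_mass4 - t_mass3:.2f} ns")
print(f"reconstructed mass: {mass_from_tof(p, t_mass4, path):.2f} GeV/c^2")
```

For these assumed values the two species arrive roughly a nanosecond apart, which is large compared with the 95 ps timing resolution quoted for the TOF system; this is the sense in which the TOF separates species whose energy-loss bands have merged.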

To evaluate the background in $^4\overline{\mathrm{He}}$ due to $^3\overline{\mathrm{He}}$ contamination, we
simulate the $^3\overline{\mathrm{He}}$ mass distribution with momenta and path lengths, as
well as the expected time of flight from $^3\overline{\mathrm{He}}$ particles with timing
resolution derived from the same data sample. The contamination
from misidentifying $^3\overline{\mathrm{He}}$ as $^4\overline{\mathrm{He}}$ is estimated by integrating over the
region of the $^4\overline{\mathrm{He}}$ selection. We estimate that the background contributes
1.4 (0.05) counts of the 15 (1) total counts from Au+Au collisions
at 200 (62) GeV recorded in 2010. Therefore, the probability of
misidentification is at the $10^{-11}$ level.

The observed counts are used to calculate the antimatter yield with 
appropriate normalization (the differential invariant yield) in order to 
compare to the theoretical expectation. Detector acceptance, efficiency, 
and antimatter annihilation with the detector material are taken into 
account when computing yields. Various uncertainties related to track- 
ing in the TPC, matching in the TOF, and triggering _ the HLT are 
cancelled when the yield ratios of “He/*He and “He/® He are calcu- 
lated. The ratios are *He/*He = (3.0 + 1.3(stat) 03(SYS)) x 10-3 
and *He/*He = (3.2 + 2. 3(stat) 7? 02(S¥8)) x 107? for central 
Au+Au collisions at 200 GeV (where ‘stat’ and ‘sys’ indicate the statistical
and systematic errors). The ratios were obtained in two windows.
The first was 40° < θ < 140°, where the polar angle, θ, is the angle
between the particle's momentum vector and the beam axis (these θ
limits correspond to limits of −1 to 1 in a related quantity,
pseudorapidity). The second was a $p_T$ per baryon window centred at $p_T$/
the $\langle dE/dx\rangle$ band centre for $^3\mathrm{He}$. The grey bands correspond to charged
particles which lie far from the region of particular focus in the present study,
and which were not selected by the HLT. The bands marked p, $\bar{p}$, K and π
correspond to protons, antiprotons, kaons and pions, respectively.


|B| = 0.875 GeV/c with a width of 0.25 GeV/c, where $p_T$ is the projection
of the momentum vector on the plane that is transverse to the beam
axis. Ratios calculated by a Blastwave model for the $p_T$/|B| window
mentioned above and for the whole range of $p_T$/|B| differ by only 1%.
The differential yields (see legend to Fig. 4) for $^4\mathrm{He}$ ($^4\overline{\mathrm{He}}$) are obtained
by multiplying the ratio of $^4\mathrm{He}/^3\mathrm{He}$ ($^4\overline{\mathrm{He}}/^3\overline{\mathrm{He}}$) with the $^3\mathrm{He}$ ($^3\overline{\mathrm{He}}$)



Figure 3 | Isotope identification based on energy loss and mass calculated
from momentum per charge and time of flight. a, b, The $\langle dE/dx\rangle$ in units of
multiples of $\sigma_{dE/dx}$, $n_{\sigma_{dE/dx}}$, of negatively charged particles (a) and positively
charged particles (b) as a function of mass measured by the TOF system. The
masses of $^3\mathrm{He}$ ($^3\overline{\mathrm{He}}$) and $^4\mathrm{He}$ ($^4\overline{\mathrm{He}}$) are indicated by the black vertical dashed
lines at 2.81 GeV/c² and 3.73 GeV/c², respectively. The light blue horizontal
dashed line marks the position of zero deviation from the expected value of
$\langle dE/dx\rangle$ ($n_{\sigma_{dE/dx}} = 0$) for $^4\mathrm{He}$ ($^4\overline{\mathrm{He}}$). The rectangular boxes highlight areas for $^4\mathrm{He}$
($^4\overline{\mathrm{He}}$) selections: $-2 < n_{\sigma_{dE/dx}} < 3$ and 3.35 GeV/c² < mass < 4.04 GeV/c²
(corresponding to a ±3σ window in mass). c, A projection of entries in a and
b onto the mass axis for particles in the window of $-2 < n_{\sigma_{dE/dx}} < 3$. The
combined measurements of energy loss and the time of flight allow a clean
identification to be made in a sample of $0.5 \times 10^{12}$ tracks from $10^{9}$ Au+Au
collisions.

yields. The systematic uncertainties consist of background (−6% for
both ratios), feed-down from (anti-)hypertritons (18% for both $^3\mathrm{He}$ and
$^3\overline{\mathrm{He}}$), knockouts from beam-material interactions (−5% for the ratio
$^4\mathrm{He}/^3\mathrm{He}$ only) and absorption (4% for the ratio $^4\overline{\mathrm{He}}/^3\overline{\mathrm{He}}$ only).
Figure 4 shows the exponential invariant yields versus baryon number
in 200 GeV central Au+Au collisions. Empirically, the production rate
reduces by a factor of 1.675% x 10°(1.1493 x 10%) for each addi- 
tional antinucleon (nucleon) added to the antinucleus (nucleus). This 
general trend is expected from coalescent nucleosynthesis models*, 
originally developed to describe production of antideuterons”, as well 
as from thermodynamic models’. 

In a microscopic picture, a light nucleus emerging from a relativistic 
heavy-ion collision is produced during the last stage of the collision 
process. The quantum wavefunctions of the constituent nucleons, if close 
enough in momentum and coordinate space, will overlap to produce the 
nucleus. The production rate for a nucleus with baryon number B is 
proportional to the nucleon density in momentum and coordinate space, 
raised to the power of |B|, and therefore exhibits exponential behaviour 
as a function of B. Alternatively, in a thermodynamic model, a nucleus is
regarded as an object with energy $E \sim |B| m_N$, where $m_N$ is the nucleon
mass, and the production rate is determined by the Boltzmann factor
exp(−E/T), where T is the temperature. This model also produces an
exponential yield. A more rigorous calculation can provide a good fit to
the available particle yields, and predicts the ratios integrated over $p_T$ to
be $^4\mathrm{He}/^3\mathrm{He} = 3.1 \times 10^{-3}$ and $^4\overline{\mathrm{He}}/^3\overline{\mathrm{He}} = 2.4 \times 10^{-3}$, consistent with
our measurements. The considerations outlined above offer a good
estimate for the production rate of even heavier antinuclei. For example,
the yield of the stable antimatter nucleus next in line (B = −6) is
predicted to be down by a factor of $2.6 \times 10^{6}$ compared to $^4\overline{\mathrm{He}}$ and
is beyond the reach of current accelerator technology.
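
The B = −6 estimate can be reproduced from the exponential scaling alone. The sketch below assumes a reduction factor of about 1.6 × 10³ per added antinucleon (the central value read from the empirical fit in the text, with its uncertainty ignored); the function name is illustrative.

```python
# Assumed central value of the empirical reduction factor per added antinucleon.
REDUCTION_PER_ANTINUCLEON = 1.6e3


def relative_yield(extra_antinucleons, reduction=REDUCTION_PER_ANTINUCLEON):
    """Yield relative to a reference antinucleus after adding antinucleons.

    Exponential scaling: each extra antinucleon divides the yield by `reduction`.
    """
    return reduction ** (-extra_antinucleons)


# Going from |B| = 4 to |B| = 6 adds two antinucleons.
suppression_b6 = 1.0 / relative_yield(2)
print(f"B = -6 yield is down by a factor of about {suppression_b6:.2e}")
```

Two steps of the per-antinucleon factor give a suppression of a few million, in line with the factor quoted in the text.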

A potentially more copious production mechanism for heavier
antimatter is by the direct excitation of complex nuclear structures
from the vacuum. A deviation from the usual rate reduction with
increasing mass would be an indication of a radically new production
mechanism. On the other hand, going beyond nuclear physics, the
sensitivity of current and planned space-based charged particle detectors
is below what would be needed to observe antihelium produced by
nuclear interactions in the cosmos, and consequently, any observation
of antihelium or even heavier antinuclei in space would indicate the
Figure 4 | Differential invariant yields as a function of baryon number, B.
The differential invariant yields $d^{2}N/(2\pi p_T\,dp_T\,dy)$ were evaluated at
$p_T/|B| = 0.875$ GeV/c, in central 200 GeV Au+Au collisions, where N is counts per
event and y is rapidity. Yields for (anti)tritons ($^3\mathrm{H}$ and $^3\overline{\mathrm{H}}$) lie close to the
positions for $^3\mathrm{He}$ and $^3\overline{\mathrm{He}}$, but are not included here because of poorer
identification of (anti)tritons. The lines represent fits with the exponential
formula $\propto e^{-r|B|}$ for positive (solid orange line) and negative (dashed blue line)
particles separately, where r is the production reduction factor. Analysis details
of yields other than $^4\mathrm{He}$ ($^4\overline{\mathrm{He}}$) have been presented elsewhere and are plotted
here as open symbols. The plotted error bars show standard statistical errors
only. Systematic errors are smaller than the symbol size, and are not plotted.

existence of a large amount of antimatter elsewhere in the Universe. In
particular, finding $^4\overline{\mathrm{He}}$ in the cosmos is one of the major motivations
for space detectors such as the Alpha Magnetic Spectrometer. We
have shown that $^4\overline{\mathrm{He}}$ exists, and have measured its rate of production
in nuclear interactions, providing a point of reference for possible
future observations in cosmic radiation. Barring one of those dramatic
discoveries mentioned above or a new breakthrough in accelerator
technology, it is likely that $^4\overline{\mathrm{He}}$ will remain the heaviest stable
antimatter nucleus observed for the foreseeable future.

Sharply increased mass loss from glaciers and ice
caps in the Canadian Arctic Archipelago


Alex S. Gardner, Geir Moholdt, Bert Wouters, Gabriel J. Wolken, David O. Burgess, Martin J. Sharp, J. Graham Cogley,
Carsten Braun & Claude Labine


Mountain glaciers and ice caps are contributing significantly to present
rates of sea-level rise and will continue to do so over the next
century and beyond. The Canadian Arctic Archipelago, located off
the northwestern shore of Greenland, contains one-third of the global
volume of land ice outside the ice sheets, but its contribution to
sea-level change remains largely unknown. Here we show that the
Canadian Arctic Archipelago has recently lost 61 ± 7 gigatonnes per
year (Gt yr⁻¹) of ice, contributing 0.17 ± 0.02 mm yr⁻¹ to sea-level
rise. Our estimates are of regional mass changes for the ice caps
and glaciers of the Canadian Arctic Archipelago referring to the years
2004 to 2009 and are based on three independent approaches: surface
mass-budget modelling plus an estimate of ice discharge (SMB+D),
repeat satellite laser altimetry (ICESat) and repeat satellite
gravimetry (GRACE). All three approaches show consistent and large
mass-loss estimates. Between the periods 2004-2006 and 2007-2009,
the rate of mass loss sharply increased from 31 ± 8 Gt yr⁻¹ to
92 ± 12 Gt yr⁻¹ in direct response to warmer summer temperatures,
to which rates of ice loss are highly sensitive (64 ± 14 Gt yr⁻¹ per 1 K
increase). The duration of the study is too short to establish a
long-term trend, but for 2007-2009, the increase in the rate of mass loss
makes the Canadian Arctic Archipelago the single largest contributor
to eustatic sea-level rise outside Greenland and Antarctica.

Several long-term records (about 50 years) of the surface mass budget
(surface accumulation minus surface ablation) of individual glaciers and
ice caps exist for the Canadian Arctic Archipelago (CAA, see Fig. 1),
but extrapolation of these records to estimate the mass budget of the
entire region introduces a large uncertainty. Repeat airborne laser altimetry
surveys have been used to estimate that the glaciers of the CAA
lost 23 Gt yr⁻¹ of ice between spring 1995 and spring 2000 (ref. 9). This
represents 0.063 mm yr⁻¹ of sea-level rise if we take the global area of the
ocean to be 362.5 × 10⁶ km² (ref. 10). Since 2000 the CAA has experienced
some of the warmest summer temperatures on record, with four
of the five warmest years since 1960 occurring after 2004 (Supplementary
Information). Between 2005 and 2009 all CAA glaciers with
long-term monitoring programmes experienced their most negative
five-year period of surface mass budget since measurements began in the
early 1960s. Here we present three independent estimates of change in
total glacier mass between autumn 2003 and autumn 2009 for the
northern CAA (Fig. 1; area 106,400 km²) and two independent estimates
for the southern CAA (Fig. 1; area 42,000 km²).
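
The conversion from ice mass loss to sea-level equivalent used above can be sketched as follows. It assumes that 1 Gt of meltwater occupies 1 km³ (a density of 1,000 kg m⁻³, which is an assumption of this sketch) and uses the ocean area of 362.5 × 10⁶ km² quoted in the text; the function name is illustrative.

```python
OCEAN_AREA_KM2 = 362.5e6  # global ocean area quoted in the text


def gt_to_mm_sea_level(mass_gt, ocean_area_km2=OCEAN_AREA_KM2):
    """Convert a mass of ice (Gt) to the equivalent global sea-level rise (mm).

    Assumes 1 Gt of meltwater occupies 1 km^3 (density 1,000 kg m^-3).
    """
    volume_km3 = mass_gt                 # 1 Gt water ~ 1 km^3
    rise_km = volume_km3 / ocean_area_km2
    return rise_km * 1e6                 # km -> mm


for rate in (23, 61, 92):                # Gt per year, as quoted in the text
    print(f"{rate} Gt/yr -> {gt_to_mm_sea_level(rate):.3f} mm/yr")
```

Applied to the rates quoted in the text, this reproduces the stated sea-level contributions to the precision given.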

The first estimate is derived using a numerical model that simulates 
the regional mass change resulting from the surface mass budget. Ice 
discharge due to the calving of icebergs from glaciers that terminate in 
the sea, denoted D, is added to the surface mass-budget model results 
to account for the total regional ice loss (model SMB+ D) (Supplemen- 
tary Information). The model is not applied to the southern CAA 
because there are too few records of glacier mass budget and near- 
surface temperature with which to calibrate the model. The second 


estimate derives mass change from the change in land-ice volume
measured using repeat laser altimetry from the Ice, Cloud and Land
Elevation Satellite (ICESat). The third estimate is derived using
repeat gravity observations collected by the Gravity Recovery and
Climate Experiment (GRACE) satellites. The three methods are independent
and produce consistent estimates of changes in glacier mass
for the years 2004 to 2009 (Fig. 2), where each year refers to the
mass-budget year starting in the autumn of the previous calendar year. All
estimates are given as the mean ± 2σ (95% confidence interval).

In general, the CAA receives low amounts of precipitation
(100-300 kg m⁻² yr⁻¹) with locally higher rates (300-1,000 kg m⁻² yr⁻¹)
Figure 1 | Glaciers and ice caps of the Canadian Arctic Archipelago. Black
dashed lines delineate the northern and southern study regions. The main panel is
an enlargement of the red rectangle superimposed on the map of the Arctic (inset).

Figure 2 | Cumulative change in glacier mass between autumn 2003 and
autumn 2009. Separate estimates are provided for the northern (a) and
southern (b) CAA. Error bars represent the 95% confidence interval.


concentrated on the east-facing slopes flanking Baffin Bay (Fig. 1).
Surface air temperatures over ice masses in the region exceed the
freezing point during only two to three months of the year. Because
there is generally low interannual variability in precipitation and high
variability in melt production, interannual variability in the regional
surface mass budget is largely governed by changes in the summer
surface energy budget. These are strongly correlated with summer
surface air temperatures, which are, in turn, highly dependent on
local synoptic conditions. In this study we apply a surface
mass-budget model that determines surface melt using the temperature-index
method. The model is forced with downscaled and bias-corrected
temperature and precipitation fields from the National Centers for
Environmental Prediction/National Center for Atmospheric Research
reanalysis (Supplementary Information). For the years 2004 to 2009 the
modelled mass loss from the surface mass budget (SMB) plus ice discharge
(D), where D = 4.6 ± 1.9 Gt yr⁻¹ (Supplementary Information),
of the northern CAA was 34 ± 13 Gt yr⁻¹ (Fig. 3). The average mass loss
from the northern CAA was 7 ± 18 Gt yr⁻¹ for the years 2004 to 2006,
increasing to 61 ± 18 Gt yr⁻¹ for the years 2007 to 2009 with a peak loss
of 79 ± 30 Gt yr⁻¹ in 2008. The difference between the two periods is
primarily due to a 42 Gt yr⁻¹ increase in melt production, which
resulted from regionally warmer summer air temperatures in the lower
troposphere. Warmer temperatures also contributed to a 7% decrease in
snow fraction. A slight decrease in annual precipitation amount, and
changes in the amount of meltwater retained by the annual snowpack,
contributed another 12 Gt yr⁻¹ to the increased mass loss.

For both the northern and southern CAA, we derived elevation 
changes from ICESat’s Geoscience Laser Altimeter System (GLAS) 
for the period 2003-2009 (ref. 20). Elevation changes are estimated 
relative to rectangular planes that are fitted to 700-m-long segments of
near-repeat-track data. The planes represent a simplified surface
topography such that multi-temporal elevation measurements that 
are slightly offset in location can be compared. We then extrapolate 
elevation changes to volume changes and convert them to mass 
Figure 3 | Modelled surface mass budget of the northern CAA between
autumn 2003 and autumn 2009. The model resolution of 0.5 km allows us to 
resolve the highly negative surface mass budgets of the outlet-glacier tongues. 


changes using a plausible range of firn and ice densities (Supplementary
Information). For the years 2004 to 2009, ICESat results show that
the northern CAA lost 37 ± 7 Gt yr⁻¹ and that the southern CAA lost
24 ± 6 Gt yr⁻¹. ICESat results show increases in mass loss between
2004-2006 and 2007-2009 of 39 Gt yr⁻¹ and 14 Gt yr⁻¹ for the northern
and southern CAA, respectively. Recent observations in both Alaska
and Greenland have found that marine-terminating glaciers are
thinning more rapidly than land-terminating glaciers. To assess
whether the same phenomenon is occurring in the CAA, we separately
determined elevation changes for marine- and land-terminating
glacier basins (Supplementary Information). Our results show no difference
in the area-averaged rate of elevation change between the two
basin types, suggesting that total ice discharge from marine-terminating
glaciers has not accelerated in recent years. This gives increased
confidence in both the extrapolation of ICESat elevation changes and
our estimate of ice discharge.

Lastly, we derived mass changes for both the northern and southern
CAA from GRACE gravity measurements. Mass-change estimates
from GRACE agree very well with the other two data sets for the
northern CAA, with an average mass loss between 2004 and 2009 of
39 ± 9 Gt yr⁻¹. The observations confirm the sharp increase in northern
CAA mass loss between 2004-2006 and 2007-2009, with an increase in
the average mass loss of 60 Gt yr⁻¹. The southern CAA is estimated to
have lost ice at an average rate of 24 ± 7 Gt yr⁻¹ over the six-year study
period, with a 16 Gt yr⁻¹ increase in the rate of loss between the first
three and last three years, and is in very good agreement with ICESat.
The most likely sources of the disagreement between the three methods
are: uncertainties in constraining the terrestrial water storage in the
GRACE estimates, the identification of the appropriate end-of-season
mass change in the GRACE signal, and fewer ICESat elevation
retrievals in 2009 (Supplementary Information).

The error-weighted mean of all mass-change estimates gives a total 
mass loss for the CAA of 368 ± 41 Gt or 1.01 ± 0.11 mm sea-level rise 
for the years 2004 to 2009. Most of the mass loss came from the northern 
CAA, which lost 224 ± 30 Gt, with the remaining 144 ± 28 Gt coming 
from the southern CAA (see Supplementary Figs 1-3 for a further 
subdivision of the mass losses within the northern and southern 
CAA). We estimate that the majority of the mass loss (about 92%) is 
due to meltwater runoff, with a much smaller contribution coming 
from ice discharge from marine-terminating glaciers (about 8%). 
Three-quarters of all mass loss occurred in the last three years of the 
observation period with an average loss of 92 ± 12 Gt yr⁻¹, or 
0.25 ± 0.03 mm yr⁻¹ sea-level rise. This rate is four times greater than 
the estimated mass loss for CAA over the period 1995 to 2000 (ref. 9). 
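
The totals quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the ocean area and water density are assumed round values that are not stated in the text, so the converted sea-level figures agree with the quoted ones only to within rounding.

```python
# Cross-check of the CAA mass-loss totals quoted in the text.
# ASSUMPTIONS (not from the paper): global ocean area and water density.
OCEAN_AREA_M2 = 3.62e14   # assumed ocean surface area, m^2
WATER_DENSITY = 1000.0    # assumed density of the added water, kg m^-3
KG_PER_GT = 1.0e12        # 1 Gt = 10^12 kg


def gt_to_mm_sea_level(mass_gt):
    """Convert a mass of water in Gt to an equivalent sea-level rise in mm."""
    volume_m3 = mass_gt * KG_PER_GT / WATER_DENSITY
    return volume_m3 / OCEAN_AREA_M2 * 1000.0  # metres -> millimetres


total_gt = 368.0                   # total loss, 2004 to 2009
north_gt, south_gt = 224.0, 144.0  # regional losses
late_rate_gt_yr = 92.0             # average loss rate, last three years

# Share of the six-year total that was lost in the last three years.
late_fraction = late_rate_gt_yr * 3.0 / total_gt

print(f"north + south     = {north_gt + south_gt:.0f} Gt")
print(f"total as sea level = {gt_to_mm_sea_level(total_gt):.2f} mm")
print(f"late rate          = {gt_to_mm_sea_level(late_rate_gt_yr):.2f} mm/yr")
print(f"late share         = {late_fraction:.2f}")
```

With these assumed constants the regional losses sum to the stated total, and three years at 92 Gt yr⁻¹ account for three-quarters of it, as the text says.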

This increase in mass loss is in direct response to warmer surface air 
temperatures in summer, to which the glaciers of the CAA have a high 
sensitivity. Over the six-year period of our study an additional 
64 ± 14 Gt yr⁻¹ of ice was lost to the oceans for every 1 K rise in mean 
summer surface air temperature. Dividing by the total glacier area gives 
an area-averaged temperature sensitivity of −430 ± 90 kg m⁻² yr⁻¹ K⁻¹, 
which is two times larger than estimated from glacier surface mass- 
budget records***”* and is close to sensitivities estimated from regional 
climatology’. The sensitivity to precipitation is much smaller; a 10% 
increase in precipitation would result in a mass gain of only about 
5 Gt yr⁻¹. Such a low sensitivity to precipitation is in contrast to gla- 
ciers located in wet maritime regions. For example, a 10% increase in 
precipitation over the Patagonia icefields, which have a combined ice 
area that is one-tenth the size of the CAA, would result in a 12 Gt yr⁻¹ 
gain of mass”°. 
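
The sensitivities in this paragraph can be related to one another with simple arithmetic. The sketch below backs out the glacier area implied by the two quoted temperature sensitivities (the area itself is not given in this section, so the result is only an inferred value) and compares the precipitation sensitivities of the two regions per unit area.

```python
# Arithmetic on the sensitivities quoted in the text.
KG_PER_GT = 1.0e12  # 1 Gt = 10^12 kg

# Temperature sensitivity, as a regional total and per unit area (magnitudes).
temp_sens_gt_yr_k = 64.0       # Gt yr^-1 K^-1
temp_sens_kg_m2_yr_k = 430.0   # kg m^-2 yr^-1 K^-1

# Glacier area implied by dividing one by the other (inferred, not quoted).
implied_area_km2 = temp_sens_gt_yr_k * KG_PER_GT / temp_sens_kg_m2_yr_k / 1.0e6

# Mass gain for a 10% precipitation increase.
caa_gain_gt_yr = 5.0           # CAA
patagonia_gain_gt_yr = 12.0    # Patagonia icefields
area_ratio = 0.1               # Patagonia ice area / CAA ice area

# Per-unit-area precipitation sensitivity of Patagonia relative to the CAA.
per_area_ratio = (patagonia_gain_gt_yr / area_ratio) / caa_gain_gt_yr

print(f"implied CAA glacier area = {implied_area_km2:,.0f} km^2")
print(f"Patagonia / CAA per-area precipitation sensitivity = {per_area_ratio:.0f}")
```

On these numbers, a unit area of Patagonian ice responds to a given fractional precipitation change far more strongly than a unit area of CAA ice, which is the contrast the paragraph draws.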

To put the mass losses occurring in the CAA into a global per- 
spective, the Patagonia icefields lost ice at an average rate of 
28 ± 11 Gt yr⁻¹ between April 2002 and December 2006 (ref. 27) with 
little change in the ice-loss trend for the years 2007 to 2009 (J. Chen, 
personal communication). The glaciers of the Gulf of Alaska lost mass 
at an average rate of 88 ± 15 Gt yr⁻¹ for the years 2004 to 2006, slow- 
ing to 70 ± 11 Gt yr⁻¹ for the years 2007 to 2009 (update to ref. 28). 
The sharp increase in mass loss from the CAA and the slowdown in 
loss from the Gulf of Alaska make the CAA the largest contributor to 
eustatic sea level rise outside Greenland and Antarctica for the years 
2007-2009. Because of the high sensitivity to temperature and low 
sensitivity to precipitation, the CAA is expected to continue to be 
one of the largest contributing regions to eustatic sea level rise well 
into the next century and beyond’. 


METHODS SUMMARY 


The surface mass-budget model was run at a resolution of 500 m by 500 m for the 
period 1949 to 2009 (Supplementary Information). Model results are validated 
against observations and agree well with in situ point surface mass-budget measure- 
ments (Supplementary Fig. 4: r = 0.86, N = 3,717, standard error = 350 kg m⁻²). 
For the four regions with well-established surface mass-budget measurement pro- 
grammes (Agassiz Ice Cap, north-western Devon Ice Cap, Meighen Ice Cap and 
White Glacier) the model has a very low bias (−18 kg m⁻² yr⁻¹) in the glacier- 
averaged surface mass budget (Supplementary Information). To be consistent with 
the other data sets presented in this study, we discuss only mass changes modelled 
over the ICESat and GRACE operational period between autumn 2003 and 
autumn 2009. 

To recover mass changes from the GRACE measurements we use forward model- 
ling of mass changes in predefined basins, minimizing the least-squares difference 
between GRACE observations and the forward model in an iterative method 
(Supplementary Information and refs 29 and 30). To avoid biases from surrounding 
areas (Supplementary Fig. 1) as a result of the limited spatial resolution and integral 
character of the GRACE observations, mass changes are modelled for the Greenland 
Ice Sheet and other areas surrounding the CAA. GRACE measurements were made 
available by the Center for Space Research (CSR version RL04) and were down- 
loaded from http://podaac.jpl.nasa.gov/DATA_CATALOG/graceinfo.html. 

More details about the data and methods can be found in the Supplementary 
Information. 
Melting of the Earth’s inner core 


David Gubbins, Binod Sreenivasan, Jon Mound & Sebastian Rost 


The Earth’s magnetic field is generated by a dynamo in the liquid 
iron core, which convects in response to cooling of the overlying 
rocky mantle. The core freezes from the innermost surface out- 
ward, growing the solid inner core and releasing light elements 
that drive compositional convection’ *. Mantle convection extracts 
heat from the core at a rate that has enormous lateral variations*. 
Here we use geodynamo simulations to show that these variations 
are transferred to the inner-core boundary and can be large enough 
to cause heat to flow into the inner core. If this were to occur in the 
Earth, it would cause localized melting. Melting releases heavy 
liquid that could form the variable-composition layer suggested 
by an anomaly in seismic velocity in the 150 kilometres immedi- 
ately above the inner-core boundary” ’. This provides a very simple 
explanation of the existence of this layer, which otherwise requires 
additional assumptions such as locking of the inner core to the 
mantle, translation from its geopotential centre’* or convection 
with temperature equal to the solidus but with composition vary- 
ing from the outer to the inner core’. The predominantly narrow 
downwellings associated with freezing and broad upwellings assoc- 
iated with melting mean that the area of melting could be quite 
large despite the average dominance of freezing necessary to keep 
the dynamo going. Localized melting and freezing also provides a 
strong mechanism for creating seismic anomalies in the inner core 
itself, much stronger than the effects of variations in heat flow so 
far considered”. 

The core responds to the non-uniform heat flow imposed by 
the mantle: it plays a purely passive role in this coupled convective 
system. Variations in heat flux around the core-mantle boundary 
(CMB), created by mantle convection, are likely to be large. They can 
be estimated by two independent methods, one using seismic tomo- 
graphy" within the supposed thermal boundary layer at the base of 
the mantle, and the other using mantle convection studies*. Both suggest 
variations comparable with the average heat flux. Inhomogeneous 
boundary conditions can produce enormous effects on core convec- 
tion’, and when background convection is small the boundary varia- 
tions can aid magnetic field generation through enhanced helical 
motions in fluid columns'*. Many geodynamo simulations have incor- 
porated thermal boundary conditions based on seismic tomography to 
explain the non-axisymmetric time average of the geomagnetic field'*"*, 
low secular variation in the Pacific’®”’, frequency of polarity reversals”, 
and persistent polarity transition paths during reversals”. 

We have explored the heat flux variability on the inner-core bound- 
ary (ICB) using numerical geodynamo calculations driven by thermal 
convection with an inhomogeneous upper boundary heat flux and 
constant lower boundary temperature. The details of our dynamo 
model are given in the Methods. Examples using the ‘tomographic 
boundary condition’ suffice to illustrate the possibility of inward heat 
flow at the lower boundary. The important parameter q* = (q_max − 
q_min)/(2 q_mean) measures the strength of the lateral variation in CMB 
heat flux relative to the average; a range from q* = 0.15 to 0.45 gives 
dynamos that vary from one relatively unaffected by the boundary 
condition to one where the magnetic field is almost stationary, or 
statistically ‘locked’ to the boundary”. 


Figure 1 gives the heat flux distribution on the upper and lower 
boundaries for a locked dynamo at q* = 0.45. The pattern of heat flux 
on the ICB mirrors that on the CMB; negative patches of heat flux 
indicate heat flow into the inner core at sites of melting if this were part 
of the model. Figure 2 shows two snapshots and a time average for a 
dynamo with q* = 0.15; again there are patches where the heat flux is 
negative despite the weaker lateral variations. Upwellings in the outer 
core are broad while downwellings are narrow and vertical in all these 
dynamos (Fig. 3), producing concentrated patches of high ICB heat 
flux immediately beneath high CMB heat flux. The regions of melting 
are therefore relatively large in comparison with the total amount of 
melting. We note, however, that dynamo models with different oper- 
ating parameters and buoyancy profiles need not produce heat flowing 
into the lower boundary: a weakly convecting regime in which lateral 
variations at the upper boundary are allowed to propagate all the way 
to the lower boundary appears to be the most conducive for inner-core 
melting. 

Three complications must be taken into account when applying the 
results of a thermal geodynamo simulation to the Earth. The first is the 
Figure 1 | Effect of mantle inhomogeneity on heat flux distribution at the 
inner core surface. Heat fluxes are applied to the upper boundary (a) and 
calculated on the constant-temperature lower boundary (b) in a geodynamo 
simulation where the flow is strongly coupled to the boundary thermal 
anomalies (q* = 0.45). The heat flux across the upper boundary ranges 
from 0.77 to 2.16 dimensionless units outwards and across the lower boundary 
ranges from −0.51 to 2.89 dimensionless units (negative values indicate heat 
flux into the inner core). This model uses an Ekman number 1.2 × 10⁻⁴, 
Rayleigh number 1.5 times the critical value for onset of convection, Prandtl 
number 1 and magnetic Prandtl number 10. (See the Methods section for 
definitions of these dimensionless numbers.) 
Figure 2 | Calculated heat flux on the lower boundary of a geodynamo 
model where q* = 0.15 for the upper boundary heat flux. Panels a and b are 
snapshots and c shows the time average over several magnetic diffusion times. 
Heat fluxes range from −0.287 to 2.126 (a), −0.124 to 1.976 (b) and −0.276 to 
1.86 (c) for the time average. The parameters used in this model are the same as 
in Fig. 1. 


heat conducted down the adiabat. This was omitted from a recent 
mantle convection study that explored the effects of a postperovskite 
layer and variations in chemical composition on the heat flux across 
the CMB and its correlation with seismic shear wave velocity*. 
Postperovskite makes little difference to heat fluxes but lateral varia- 
tions in composition, such as a subducted slab lying on the CMB, 
greatly increase the ratio q*. To apply these results to core convection 
we must first subtract the heat conducted down the adiabatic temper- 
ature gradient. Typical estimates of the adiabatic gradient at the CMB 
(1 K km⁻¹) and core thermal conductivity (k = 50 W m⁻¹ K⁻¹) give a 
conducted heat flux of 50 mW m⁻², comparable with q_mean for the 
mantle convection calculations. Subtracting this raises the relevant 
q* dramatically because it reduces q_mean to nearly zero while leaving 
the range q_max − q_min unchanged. In fact there is nothing to stop q* 
becoming infinite, as it nearly does for the most realistic mantle model 
in the previous study (model TC-3.6, which has a compressible py- 
roxene content); this merely means the top of the core is thermally 
neutral. Most dynamo simulations have been restricted to rather low 
q* because the dynamo tends to fail for large lateral heat flux varia- 
tions'*"*. In our models with internal heating the dynamo fails by 
q* ~1 but dynamos with basal heating and stratified upper layers 
continue to work for large q* (ref. 23). The upper region of the 
Earth’s fluid core is likely to be stably stratified, or at most only weakly 
convecting~*”*, and a high q* is therefore quite possible and appropri- 
ate for the Earth. Two factors are likely to increase q* with depth. First, 
Figure 3 | Temperature (colour contours) and fluid flow (arrows) on the 
equatorial section for the statistically locked tomographic model 
(q* = 0.45). The lowest temperature is blue and the highest temperature is 
deep red. We note the narrow downwellings beneath cold regions (the two 
major ones coinciding with the ‘ring of fire’ around the Pacific) and broad 
upwellings (corresponding to the mid-Pacific and African superplume). This 
leads to relatively large areas of negative (melting) and low-positive heat flux on 
the ICB and relatively small areas of strong-positive heat flux (freezing). 


the adiabatic gradient weakens with depth by a factor of about three 
between the CMB and ICB. At the ICB the adiabatic heat flux must be 
added back on to the model results, reducing any heat flow into the 
inner core; however, the weakened adiabat makes this a relatively small 
effect. Second, narrow downwellings and the spherical geometry tend 
to concentrate the convected heat flux, increasing the lateral variations. 
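
The effect of removing the adiabatic heat flux on q* can be illustrated numerically. In the sketch below the conducted flux uses the gradient and conductivity quoted in the text, whereas the three mantle heat-flux values are purely hypothetical numbers chosen for illustration; they are not taken from any of the cited models.

```python
import math


def conducted_flux_mw_m2(conductivity_w_m_k, gradient_k_per_km):
    """Heat flux conducted down a temperature gradient, in mW m^-2."""
    gradient_k_per_m = gradient_k_per_km / 1000.0
    return conductivity_w_m_k * gradient_k_per_m * 1000.0  # W -> mW


def q_star(q_max, q_min, q_mean):
    """Lateral heat-flux variation relative to the mean: (max - min) / (2 mean)."""
    if q_mean == 0.0:
        return math.inf  # thermally neutral top of the core
    return (q_max - q_min) / (2.0 * q_mean)


# Values quoted in the text: 1 K km^-1 and k = 50 W m^-1 K^-1.
adiabatic = conducted_flux_mw_m2(50.0, 1.0)

# HYPOTHETICAL total CMB heat fluxes in mW m^-2 (illustration only).
q_max, q_min, q_mean = 90.0, 30.0, 60.0

before = q_star(q_max, q_min, q_mean)
after = q_star(q_max - adiabatic, q_min - adiabatic, q_mean - adiabatic)

print(f"adiabatic flux = {adiabatic:.0f} mW m^-2")
print(f"q* before subtraction = {before:.2f}")
print(f"q* after subtraction  = {after:.2f}")
```

Subtracting a uniform flux leaves the numerator untouched and shrinks only the denominator, so q* grows without bound as the convected mean flux approaches zero.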

The second complication to consider is compositional convection. 
The compositional gradient is neutral or stabilizing at the CMB 
(assuming no passage of light elements into the mantle): convection 
at the top of the core is purely thermal. Compositional buoyancy tends 
to dominate thermal buoyancy deeper in the outer core, particularly 
near the ICB, as the following calculation shows. The buoyancy force is 
ρ(α_c c + α_T T)g, where ρ is the density, g the acceleration due to gravity, 
α_T the thermal expansion coefficient and α_c the compositional expan- 
sion coefficient. Compositional changes therefore have the thermal 
equivalent α_c/α_T. Comparing heat and mass fluxes in the respective 
diffusion equations shows that the conversion factor is C_p α_c/α_T, 
where C_p is the specific heat. Freezing 1 kg of liquid at the ICB releases 
L joules of latent heat and c kilograms of mass with thermal equi- 
valent C_p α_c c/α_T joules. The effective buoyancy ratio is therefore 
C_p α_c c/(L α_T) = 2.3 for a concentration c = 0.0252, corresponding to a 
density jump at the ICB of 0.6 g cm⁻³ (from PREM) assuming that 
0.34 g cm⁻³ of this comes from the solid-liquid phase transition for 
pure iron. Compositional buoyancy dominates and will be even lar- 
ger for larger ICB density jumps: 4.1 for 0.8 g cm⁻³ and 5.8 for 1.0 g 
cm⁻³. Thus temperature variations are relatively unimportant in the 
buoyancy force near the ICB but are crucial in determining the rate of 
freezing, and therefore the supply of buoyancy through the release of 
light elements. Lateral variations in temperature imposed by the upper 
boundary will be carried down to the ICB by compositional convec- 
tion, assisted by thermal convection, so we expect the variations on the 
ICB observed in the thermal or codensity geodynamo simulations to be 
sustained in a thermo-chemical system. 
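
The three buoyancy ratios quoted above are mutually consistent if the ratio scales linearly with the compositional part of the density jump, that is, the total jump minus the part attributed to the phase transition. That linear scaling is an assumption made for this sketch rather than a statement from the text; the code simply rescales the quoted reference case.

```python
# Rescale the quoted buoyancy ratio to other ICB density jumps.
# ASSUMPTION: the ratio is proportional to the compositional part of the jump.
PHASE_JUMP = 0.34   # g cm^-3 attributed to the solid-liquid transition
REF_JUMP = 0.6      # g cm^-3, reference total density jump
REF_RATIO = 2.3     # buoyancy ratio quoted for the reference jump


def buoyancy_ratio(total_jump):
    """Compositional-to-thermal buoyancy ratio for a given total density jump."""
    compositional = total_jump - PHASE_JUMP
    return REF_RATIO * compositional / (REF_JUMP - PHASE_JUMP)


for jump in (0.6, 0.8, 1.0):
    print(f"density jump {jump:.1f} g cm^-3 -> ratio {buoyancy_ratio(jump):.1f}")
```

Under this assumption the rescaled values round to the ratios given in the text for the two larger density jumps.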

The third complication is the possible dynamic consequences of the 
variable-composition layer. The density gradient across the layer of 
freshly melted, heavy liquid is vastly steeper than anything arising 
from convection in the main part of the outer core: a density change 
of the order of 0.1 g cm⁻³ across a 150-km layer compared to a typical 
convective density fluctuation of 10° g cm⁻³ or less across a compar- 
able or longer length scale, as estimated from a buoyancy–Coriolis 
force balance near the ICB**. Such a steep density gradient would 
prevent downwellings from reaching the ICB, but plumes of light 
material produced by freezing would rise through it, drawing heavy 
liquid along the ICB towards the regions of freezing and maintaining 
mixing of the variable-composition layer. Laboratory experiments 
suggest that the plumes could mix the layer if the melting exceeds 
20% of the freezing*, but the plumes on the ICB are determined by 
thermal, not compositional, effects. Further study is needed to under- 
stand the influence of this layer. 

Regional melting of the inner core that results from heat flux varia- 
tions at the CMB provides the simplest explanation of the observed 
variable-composition layer at the base of the outer core. It also provides 
a strong mechanism for seismic anomalies in the solid inner core itself 
because areas of melting will consist of recently exposed, precom- 
pressed material whereas areas of freezing will have layers of recently 
formed, unconsolidated mush. Variations in heat flux have already 
been invoked to explain seismic anomalies inside the inner core” 
but actual melting will produce even stronger effects’. In both cases, 
any correlation with mantle anomalies and persistence of locality 
requires the inner core and, to some extent, the core flow to be locked 
to the mantle. If these observations hold up to further scrutiny—in 
particular, if the variable-composition layer turns out not to require 
inner-core locking—they will provide important constraints on core 
evolution, convection and the dynamo. 


METHODS SUMMARY 


We consider a thermal convection-driven dynamo operating in an electrically 
conducting fluid. The Earth’s outer core is modelled as a spherical shell confined 
between a solid iron inner core of radius r_i and an insulating mantle at radius r_o. 
The radius ratio r_i/r_o is taken to be that of the Earth, 0.35. In the Boussinesq 
approximation”, the time-dependent, three-dimensional magnetohydrodynamic 
equations for the velocity u, the magnetic field B and the temperature T are solved 
numerically. The governing equations and numerical method are described in the 
Methods. The inner boundary in the model is considered to be at a fixed temper- 
ature, whereas the outer boundary is subject to a lateral variation in heat flux that 
has the same structure as the seismic shear-wave velocity variation in the lower 
mantle’. This assumes shear velocity is determined by temperature and not by 
composition. The dominant pattern is a fast (cold) ring around the Pacific rim and 
slow (hot) regions beneath the Pacific and Africa (see Fig. 1). 

The parameter regime used in this paper has been considered in the study of a 
boundary-locked dynamo'!*””. The Ekman number is kept sufficiently small to 
make the dynamics rotationally dominant, and the Rayleigh number is chosen 
such that free convection does not swamp the effect of the CMB lateral inhomo- 
geneity. When the heat flux inhomogeneity ratio q* is sufficiently large, this regime 
is characterized by a boundary-driven thermal wind balance", that is, a balance 
between the lateral buoyancy and Coriolis forces. This force balance causes the 
narrow downwellings to remain locked at preferred longitudes, which can explain 
the quasi-stationary, non-axisymmetric flux patches in today’s geomagnetic field. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Numerical dynamo model. We consider a thermal convection-driven dynamo in 
which an electrically conducting fluid is confined between two concentric, co- 
rotating spherical surfaces. The radius ratio r_i/r_o is chosen to be that of the Earth, 
0.35. In the Boussinesq approximation”, the time-dependent, three-dimensional 
magnetohydrodynamic equations for the velocity u, the magnetic field B and the 
temperature T are solved numerically*’. The governing dimensionless equations 
are: 


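$$\frac{E}{Pm}\left(\frac{\partial \mathbf{u}}{\partial t} + (\nabla \times \mathbf{u}) \times \mathbf{u}\right) + \hat{\mathbf{z}} \times \mathbf{u} = -\nabla p + Ra\,Pm\,Pr^{-1}\,T\,\mathbf{r} + (\nabla \times \mathbf{B}) \times \mathbf{B} + E\nabla^{2}\mathbf{u} \quad (1)$$

$$\frac{\partial \mathbf{B}}{\partial t} = \nabla \times (\mathbf{u} \times \mathbf{B}) + \nabla^{2}\mathbf{B} \quad (2)$$

$$\frac{\partial T}{\partial t} + (\mathbf{u} \cdot \nabla)T = Pm\,Pr^{-1}\,\nabla^{2}T \quad (3)$$

$$\nabla \cdot \mathbf{u} = \nabla \cdot \mathbf{B} = 0 \quad (4)$$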


The dimensionless groups in the above equations are the Ekman number, 
E = ν/2ΩD², the Prandtl number, Pr = ν/κ, the magnetic Prandtl number, 
Pm = ν/η and the ‘modified’ Rayleigh number Ra = gαβ_iD³/2Ωκ, which is the 
product of the conventional Rayleigh number and the Ekman number. The defi- 
nition of the Rayleigh number depends on the basic state (conductive) profile in 
the model (see below). In the above dimensionless groups, ν is the kinematic 
viscosity, κ is the thermal diffusivity, η is the magnetic diffusivity, D is the gap- 
width of the spherical shell, Ω is the angular velocity of rotation, g is the gravita- 
tional acceleration, α is the coefficient of thermal expansion and β_i is a constant 
that determines the basic state temperature profile, T_0. The Ekman number is a 


measure of the rotation rate and the Rayleigh number represents the strength of 
convective buoyancy in the problem. Our models use an Ekman number of 
E = 1.2 × 10⁻⁴, a Rayleigh number of 1.5 Ra_c, where Ra_c is the critical Rayleigh 
number for onset of nonmagnetic convection, a Prandtl number Pr=1 and 
magnetic Prandtl number Pm = 10. 

No-slip boundary conditions are imposed on the flow at the ICB and at the CMB. 
The inner core is considered to be at a fixed temperature and electrically conducting. 
The isothermal condition at the ICB is reasonable for a solid core of high thermal 
conductivity. However, compositional buoyancy in the form of light-element 
release over areas of freezing can complicate the boundary condition at the ICB. 
The upper boundary in the model is maintained electrically insulating to mimic the 
mantle and subject to a lateral variation in heat flux that has the same structure as 
the seismic shear-wave velocity variation in the lower mantle. The basic state 
temperature profile imposed in the model represents a uniform distribution of heat 
sources, and is given by T_0(r) = β_i (r_i² − r²)/2, where r_i is the inner radius and β_i is 
related to a prescribed, uniform heat source Q as follows: β_i = Q/3κ. 

The velocity and magnetic field vectors are expressed in terms of poloidal and 
toroidal scalars, as follows: 


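$$\mathbf{u} = \nabla \times \nabla \times [P_u \mathbf{r}] + \nabla \times [T_u \mathbf{r}], \qquad \mathbf{B} = \nabla \times \nabla \times [P_B \mathbf{r}] + \nabla \times [T_B \mathbf{r}] \quad (5)$$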


whereby the continuity equations (4) are satisfied. The standard numerical 
method used here involves expanding the above four scalar variables and the 
temperature T in spherical harmonics in latitude θ and longitude φ, and time- 
stepping the spectral coefficients. Finite differences are used in the radial direction. 
The numerical integration of the equations is performed for at least five magnetic 
diffusion times. 


31. Sreenivasan, B. & Jones, C. A. The role of inertia in the evolution of spherical 
dynamos. Geophys. J. Int. 164, 467-476 (2006). 
Eocene lizard from Germany reveals amphisbaenian origins 


Johannes Müller, Christy A. Hipsley, Jason J. Head, Nikolay Kardjilov, André Hilger, Michael Wuttke & Robert R. Reisz 


Amphisbaenia is a speciose clade of fossorial lizards characterized 
by a snake-like body and a strongly reinforced skull adapted for 
head-first burrowing’. The evolutionary origins of amphisbae- 
nians are controversial, with molecular data uniting them with 
lacertids**, a clade of Old World terrestrial lizards, whereas mor- 
phology supports a grouping with snakes and other limbless squa- 
mates’. Reports of fossil stem amphisbaenians’® have been 
falsified"', and no fossils have previously tested these competing 
phylogenetic hypotheses or shed light on ancestral amphisbaenian 
ecology. Here we report the discovery of a new lacertid-like lizard 
from the Eocene Messel locality of Germany that provides the first 
morphological evidence for lacertid-amphisbaenian monophyly 
on the basis of a reinforced, akinetic skull roof and braincase, sup- 
porting the view that body elongation and limblessness in amphis- 
baenians and snakes evolved independently. Morphometric analysis 
of body shape and ecology in squamates indicates that the postcra- 
nial anatomy of the new taxon is most consistent with opportunis- 
tically burrowing habits, which in combination with cranial 
reinforcement indicates that head-first burrowing evolved before 
body elongation and may have been a crucial first step in the evolu- 
tion of amphisbaenian fossoriality. 


Reptilia Laurenti, 1768 
Squamata Oppel, 1811 
Lacertibaenia Vidal and Hedges, 2005 
Cryptolacerta hassiaca gen. et sp. nov. 


Etymology. Crypto-, from the ancient Greek κρυπτός, meaning ‘hid- 
den’ or ‘secret’, referring to the inferred ecology of the animal; lacerta 
(Latin), meaning lizard; hassiaca (Latin), feminine adjective for Hesse, 
the German province of the Messel locality. 

Holotype. SMF ME 2604 (Fig. 1), Forschungsinstitut und 
Naturmuseum Senckenberg, Frankfurt, Germany. 

Locality and horizon. West of Quarry 2, 50 cm above Level B'”. Messel 
Pit World Heritage Site, Hesse, Germany; Eocene (Lutetian). 
Diagnosis. Lacertibaenian squamate with a snout-vent length of 
approximately 7 cm; skull capsule-like, anteriorly downturned and 
heavily ossified; transverse nasofrontal suture; small narial openings 
facing strictly anteriorly owing to a unique dorsolateral covering by 
the maxilla; small posterodorsal coronoid process of the dentary; 14 
dentary, 7 premaxillary and 12 maxillary teeth with the posterior-most 
maxillary tooth enlarged; 27 presacral vertebrae; manus and pes 
strongly reduced in size relative to the remaining limb. Shares with 
amphisbaenians a relatively elongated postorbital skull portion, blunt 
and rounded snout, sutural contact between prefrontal and postorbito- 
frontal, contact between prefrontal and jugal, absence of a lacrimal, 
small jugal with only little angulation, subequal width of the anterior 
and posterior borders of the frontal, absence of frontal constriction 
between the orbits, loss of the tympanic crest, neural spines reduced, 
seven or fewer cervical vertebrae, rod-like clavicles, absence of an 


anterior coracoid emargination and interclavicle, fusion of cephalic 
scales, transversely widened frontal subolfactory processes, thickening 
of maxilla and frontal, small orbits, a vertical tongue-and-groove articu- 
lation between the frontals, and absence of an iliac anterodorsal 
projection. 

The type and only known specimen of Cryptolacerta hassiaca is 
nearly complete, missing only the distal tail (Fig. 1a). Computed tomo- 
graphy (CT) imaging and specimen examination (Figs 1b, c and 2) 
reveal a mosaic of lacertid and amphisbaenian anatomical characters. 
The skull is massive and heavily ossified, with an anteroventrally 
downturned anterior portion (Fig. 1d). Extensive dermal sculpturing 
covers the skull roof and well preserved scute sulci reflect the presence 
of large, transversely oriented scales. Both maxilla and frontals display 
a massive thickening in cross-section coupled with an increase in bone 
density, obscuring the vascularized internal structure seen in the more 
posterior cranial elements as well as in lacertids and most other lizards; 
the same condition occurs in amphisbaenians. The 
external nares are small and anteriorly oriented and are bounded 
dorsally by a unique anteromedial flange of the maxilla. Small orbits 
indicate reduced eyes, and the prefrontal and postfrontal have a strong 
sutural contact similar to fossil amphisbaenians'*"’. Cryptolacerta has 
a vertically tall tongue-and-groove interdigitation of the median con- 
tact of the paired frontals as in amphisbaenians (Fig. 2a), and the 
prominent frontal subolfactory processes, although lacking a median 
contact as in lacertids, are notably widened transversely and form the 
major part of the posterior wall of the nasal capsule, a feature shared 
with amphisbaenians (see Supplementary Information). The parietal 
table is prominent and shows the typical lacertid Y-shaped crest that 
articulates with the braincase on its ventral surface (Fig. 2b). The large 
size and ventral extension of the crest indicate close proximity or 
ossification with the prootic and supraoccipital, resulting in reduced 
cranial kinesis. The braincase is crushed, and only the parabasisphe- 
noid and slender basipterygoid articulations are preserved. The middle 
ear is reduced as evidenced by the absence of a quadrate tympanic crest 
(Fig. 2c). The dentary has a posterolateral extension covering the ante- 
rolateral part of the coronoid (Fig. 1c), as in many amphisbaenians, 
despite retaining a typically lacertid shape. 

Cryptolacerta possesses a distinctive heterodont dentition. The six 
preserved teeth on the premaxilla are conical and diminutive. The 11 
maxillary teeth continuously increase in size posteriorly, with the last 
tooth being expanded and bulbous in shape and the remaining teeth 
having bicuspid crowns. The 14 teeth on the dentary are also similar to 
the maxillary teeth, but lack an enlarged posterior-most tooth. 

Postcranially, Cryptolacerta possesses 29 procoelous precaudal 
vertebrae with very low neural spines, including seven cervicals and 
two sacrals. The pectoral girdle consists of recurved clavicles, slender 
scapulocoracoids and the sternum, whereas the interclavicle is absent 
(Fig. 2d). The pelvic girdle possesses a well-developed ilium that lacks 
an anterodorsal process (Fig. 2e). Although not all autopodia are fully 
Figure 1 | Cryptolacerta hassiaca gen. et sp. nov., holotype (SMF ME 2604). 
a, Nearly complete specimen. b, Micro-CT scan of skull and outline of bones in 
dorsal view. c, Micro-CT scan of skull and outline of bones in ventral view. 
d, Reconstruction of skull in dorsal and lateral view. Scale bars, 5 mm (a), 2 mm 
(b, c). Abbreviations: an, angular; ar, articular; c, coronoid; d, dentary; ec, 
ectopterygoid; f, frontal; hy, hyoid; j, jugal; m, maxilla; n, nasal; p, parietal; pb, 
palpebral; pbs, parabasisphenoid; pf, prefrontal; pob, postorbital; pof, 
postfrontal; pm, premaxilla; pra, prearticular; pt, pterygoid; q, quadrate; sa, 
surangular; so, supraoccipital; sq, squamosal; st, supratemporal. 


Figure 2 | Cryptolacerta hassiaca gen. et sp. nov., holotype (SMF ME 2604), 
anatomical features as revealed by CT. a, Transverse section through the 
anterior part of the frontals (f). b, Parietal in ventral view showing the Y-shaped 
crest. c, Left quadrate in posterolateral view. d, Section showing shoulder girdle 
with sternum (st), scapulocoracoid (sc) and clavicle (cl). e, Pelvic girdle with 
outlines to emphasize the morphology of ischium (is), ilium (il) and pubis (pu). 
f, Manus with digits I-V; note that digit IV lies on top of digit V, as revealed by 
different CT sections. Scale bars, 1 mm. 


preserved, the phalangeal formula of 2-3-4-4/5?-3 suggests that no 
digits are lost, but the phalangeal elements are miniaturized relative to 
the remaining limb bones (Fig. 2f). 

The systematic position of amphisbaenians within Squamata is 
poorly constrained. Molecular data support a sister-taxon relationship 
with lacertids**'*'’, but there is no morphological character support 
among living taxa uniting the highly derived amphisbaenians with 
lacertids. Most morphological analyses support a common ancestry 
of amphisbaenians and snakes*°, but character support for this 
hypothesis has been considered homoplastic. To determine the sig- 
nificance of Cryptolacerta for resolving systematic relationships of 
amphisbaenians, we performed a phylogenetic analysis on a combined 
data set of morphological characters and nuclear gene sequences (rag- 
1, c-mos) for extant and fossil squamates using parsimony and 
Bayesian methods (Fig. 3a). Analyses of combined data recover a 
monophyletic lacertid–amphisbaenian (‘lacertibaenian’) clade, with 
Cryptolacerta clustering as sister taxon to Amphisbaenia in both the 
parsimony and Bayesian analyses (Fig. 3a). The sister relationship with 
Amphisbaenia is supported by 19 characters distributed across the 
entire skeleton (see Supplementary Information). Although homoplasy 
is common in many squamate osteological characters, the 
tongue-and-groove articulation of the frontals is unique to Cryptolacerta and 
Amphisbaenia, the transversely widened frontal downgrowths occur 
otherwise only in some scolecophidian snakes, the thickened frontal 
and maxilla can be found to a variable extent only in dibamids, snakes 
and varanids, the absence of a tympanic crest and the very low neural 
spines otherwise occur only within fossorial snakes, dibamids and some 
anguimorphs, and a sutural prefrontal-postorbitofrontal contact is 
shared only by some anguimorphs and Sineoamphisbaena. This last 
taxon, a Mesozoic squamate previously considered a 
stem-amphisbaenian, falls within Polyglyphanodontidae as in other recent analyses 
(Fig. 3a). 

Figure 3 | Phylogeny of Cryptolacerta and the evolution of cranial akinesis 
in the origin of the amphisbaenian skull. a, Time-calibrated phylogeny of 
Squamata based on Bayesian analysis of morphological and molecular 
characters. Bold taxa with asterisks are from the Messel locality; the blue box 
denotes Lacertibaenia. b, Evolution of the amphisbaenian skull. Dorsal skull 
roofs and transverse sections through the dorsal braincase of (from left to right) 
Podarcis pityusensis (Lacertidae), Cryptolacerta hassiaca gen. et sp. nov. and 
Rhineura floridana (Amphisbaenia). The ventroparietal crest of lacertids is 
ventrally connected with the prootic (blue) through a membranous sheet. In 
Cryptolacerta the crest is more prominent and must have had an extensive 
contact with the prootic. In amphisbaenians the crest is in full contact with the 
prootic and forms a secondary temporal region. 

Cranial osteology of Cryptolacerta provides the first evidence of the 
origin of the derived amphisbaenian skull (Fig. 3b). In both Cryptolacerta 
and amphisbaenians the skull is reinforced by a strong vertical inter- 
digitation of the frontals, thick, dense maxillae and frontals, and ventral 
downgrowth of the parietal. In lacertids the anterolateral portions of 
the ventroparietal crest closely approach the membranous alar pro- 
cesses of the prootic, and in amphisbaenians a membranous extension 
of the prootic is sutured to the ventrally extending parietal (Fig. 3b). 
Although the crest in Cryptolacerta is similar to that of lacertids, it is much 
more strongly developed and we infer extensive contact between the 
parietal and prootic. Additionally, in basal amphisbaenians the dorsal 
outline of the parietal table strongly reflects the shape of the 
ventroparietal crest of Cryptolacerta, suggesting that the lateral parts of a 
lacertid-like parietal became reduced during amphisbaenian evolution. 

Body shape in squamates corresponds to locomotory habits, and 
the nearly complete skeleton of Cryptolacerta provides an opportunity 
to infer ecology near the origin of Amphisbaenia. To determine the 
habits of the taxon, we morphometrically analysed body shape in 
Cryptolacerta and extant squamates occupying habitats represented 
in the Messel depositional system (Fig. 4). Principal component 
analysis of cranial, axial and appendicular measurements produced 
morphospaces within which ecological habits were defined for extant 
taxa, and inferred for Cryptolacerta. For all coordinated principal com- 
ponent axes, Cryptolacerta falls outside the morphospace defined by fully 
fossorial squamates (Fig. 4). Although the reinforced skull and super- 
ficially small limbs are suggestive of fossorial habits, Cryptolacerta occu- 
pies a position within morphospace defined by taxa that are cryptic, leaf 
litter specialists and opportunistic burrowers (Fig. 4), based on relative 
body size, limb lengths and head size. 

Ecomorphometry of Cryptolacerta and adaptations for a reinforced 
skull indicate that the early ecology of amphisbaenians and their rela- 
tives consisted of cryptic behavioural habits combined with head-first 
substrate locomotion, possibly as a defensive or predation strategy. 
Body elongation and limb reduction or loss are often considered 
prerequisites of fossoriality in squamates; however, Cryptolacerta 
demonstrates that modifications to cranial architecture preceded 
postcranial specializations in amphisbaenians, showing that hypotheses of 
ecological character correlation may mask radically different histories 
of character evolution in ecological specialization. 

Figure 4 | Ecomorphology of Cryptolacerta. Principal component analysis of 
squamate morphology with ecological habits projected into shape space. 
Fossorial and cryptic morphospaces are shaded. Cryptolacerta occupies a 
position within the cryptic and terrestrial habit spaces. 

Recent molecular divergence estimates and the fossil record 
indicate that lacertids and amphisbaenians diverged in the Late Cretaceous, 
at least 20 Myr before the occurrence of Cryptolacerta. The late 
occurrence of Cryptolacerta is consistent with hypotheses that intermediate 
body forms in the evolution of body elongation and limblessness can 
persist for tens of millions of years. It also suggests that the 
Palaeogene of Europe was a refugium for archaic Mesozoic squamate 
lineages, as indicated by co-occurring Eolacerta, 'Saniwa', and 
Ornatocephalus (Fig. 3a), probably resulting from the island 
geography of Europe during the Late Cretaceous and early Cenozoic. 


METHODS SUMMARY 


The specimen was scanned by CT at the Helmholtz Centre Berlin for Materials and 
Energy using a micro-focus X-ray tube. Cranial reconstructions were performed 
using a wax model based on the CT data. Phylogenetic analyses were run using a 
partitioned data set of 364 morphological and 3,216 molecular (rag-1, c-mos) char- 
acters as well as 65 terminal taxa, using both parsimony and Bayesian methodology. 
Morphometric analysis used principal component analysis of a published data set of 
linear measurements of squamate body forms for taxa inhabiting environments 
represented in the Grube Messel depositional system, to which Cryptolacerta was 
added based on measurements taken with a digital caliper. 
Species-area relationships always overestimate 
extinction rates from habitat loss 


Fangliang He & Stephen P. Hubbell 


Extinction from habitat loss is the signature conservation problem 
of the twenty-first century. Despite its importance, estimating 
extinction rates is still highly uncertain because no proven direct 
methods or reliable data exist for verifying extinctions. The most 
widely used indirect method is to estimate extinction rates by 
reversing the species-area accumulation curve, extrapolating 
backwards to smaller areas to calculate expected species loss. Estimates 
of extinction rates based on this method are almost always much 
higher than those actually observed. This discrepancy gave rise 
to the concept of an 'extinction debt', referring to species 'committed 
to extinction' owing to habitat loss and reduced population size but 
not yet extinct during a non-equilibrium period. Here we show 
that the extinction debt as currently defined is largely a sampling 
artefact due to an unrecognized difference between the underlying 
sampling problems when constructing a species-area relationship 
(SAR) and when extrapolating species extinction from habitat loss. 
The key mathematical result is that the area required to remove the 
last individual of a species (extinction) is larger, almost always much 
larger, than the sample area needed to encounter the first individual 
of a species, irrespective of species distribution and spatial scale. We 
illustrate these results with data from a global network of large, 
mapped forest plots and ranges of passerine bird species in the 
continental USA; and we show that overestimation can be greater 
than 160%. Although we conclude that extinctions caused by habitat 
loss require greater loss of habitat than previously thought, our 
results must not lead to complacency about extinction due to habitat 
loss, which is a real and growing threat. 

The Millennium Ecosystem Assessment predicts that near-term 
extinction rates could be as high as 1,000 to 10,000 times background 
rates (see also ref. 7). Most predictions of species extinction rates, 
including those in the Millennium Ecosystem Assessment, are inferred 
from applying the SAR to rates of habitat loss. The wide 
discrepancy between the rates of species extinction predicted by this method 
and the extinction rates actually recorded has fuelled a continuing 
debate about how to explain the discrepancy. The main issue 
is that, almost always, more species are left after a given loss of habitat 
than the number of species predicted to remain, based on the SAR. The 
most frequent interpretation is that the excess species are 'committed 
to extinction'. The term 'extinction debt' was coined to refer to species' 
populations that were no longer viable but were facing certain 
extinction due to habitat destruction that had already occurred. The 
consensus on the most likely reason for the extinction debt is that 
there is a time lag for populations to go extinct after severe losses in 
population size. 

Here we show that extinction rates estimated from the SAR are all 
overestimates. We define extinction rate as the fractional loss of species 
over a defined period accompanied by a given loss of habitat. These 
overestimates are due to the false assumption that the sampling problem 
for extinction is simply the reverse of the sampling problem for the SAR. 
The area that must be added to find the first individual of a species is in 
general much smaller than the area that must be removed to eliminate 
the last individual of a species (Fig. 1). Therefore, on average, it takes a 
much greater loss of area to cause the extinction of a species than it 
takes to add the species on first encounter, except in the degenerate case 
of a species having a single individual. We show mathematically that 
this is a necessary result of fundamental sampling differences between 
the SAR and the endemics-area relationship (EAR). Only in a very 
special and biologically unrealistic case, when all species are randomly 
and independently distributed in space, is it possible to derive the EAR 
from the SAR. Although this special case almost never occurs in nature, 
we examine this simple case first to clarify the nature of the problem. 
Then we relax these assumptions and consider the general case of 
aggregated species distributions. 

The problem has gone unnoticed for so long because the traditional 
method for estimating extinction uses the power-law SAR, $S = cA^{z}$, 
which has no sampling theory relating it to species distributions 
(Supplementary Information A). To develop a sampling theory, we 
must consider the spatial distribution of species explicitly (Supplemen- 
tary Information B and C). We derive the SAR and EAR from nearest- 
neighbour distances under two situations, random dispersion and 
clumped dispersion. We construct an SAR from the probability of 
encountering the first nearest neighbour of a species (a new species 
is added every time the sampling frame a encounters the first indi- 
vidual of the given species). In contrast, we construct the EAR from the 
probability of encountering the last neighbour of a species (a species is 
added only after all individuals are contained within frame a). We 
arrive at the species-area curve for randomly and independently 
distributed species as (Supplementary Information B): 

$$S_a = S - \sum_{i=1}^{S} \left(1 - \frac{a}{A}\right)^{N_i} \qquad (1)$$

and the endemics-area curve (written here as $S_{\mathrm{loss}}$, the number of 
species confined to, and therefore lost with, area a) as: 

$$S_{\mathrm{loss}} = \sum_{i=1}^{S} \left(\frac{a}{A}\right)^{N_i} \qquad (2)$$

where $N_i$ is the total abundance of species i and S is the total number of 
species in the region A. Equations (1) and (2), derived from 
nearest-neighbour distances, are identical to the classical random placement 
models. 

Figure 1 | Sampling differences for SAR and EAR. Range distribution of a 
species (blue area), and an arbitrary starting sample point, indicated by +. 
Regardless of the starting location, the area of a sampling frame of arbitrary 
shape (here circular) that is just large enough to contact the species for the 
first time ('area of first encounter') is always less than the sample area needed 
to encompass the entire range of the species ('area of last encounter'). The SAR 
(species accumulation) is constructed from sample areas of first contact, and 
the EAR (species extinction) is constructed from areas of last contact. 

Let the total area be A and let a sub-area a be lost. For randomly and 
independently distributed species, we can calculate the expected 
number of species lost with a loss of area a from the SAR (equation (1)) as 
$S_{\mathrm{loss}} = S - S_{A-a}$. This is identical to the EAR calculated directly from 
equation (2): $S_{\mathrm{loss}} = \sum_{i=1}^{S} (a/A)^{N_i}$. This proves that, for 
the special case of species distributed randomly in space, extinction 
rates estimated from the backward random placement SAR and from 
the forward random placement EAR are the same, and the SAR and 
EAR are mirror images (Fig. 2 and Supplementary Fig. 1). This case is 
true because, under random placement, the total area A is equal to the 
sum of the areas of encountering the first individual and the last 
individual of a species. From the probability models of the 
nearest-neighbour distance, the expected area needed to sample the first 
individual is $a^{1} = A/(N + 1)$, and the expected area for the last individual is 
$a^{N} = NA/(N + 1)$ (Supplementary Information B). Thus $a^{1} + a^{N} = A$. 
Note that $a^{N} > a^{1}$ is always true except when N = 1. 
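The mirror-image property under random placement can be checked numerically. The Python sketch below is an illustration only and is not the authors' code: the function names and the abundance values are hypothetical, and it assumes that equation (1) has the form S - sum((1 - a/A)^N_i) and equation (2) the form sum((a/A)^N_i), as implied by the surrounding text.

```python
# Illustrative sketch of the random placement SAR and EAR (equations (1) and (2)).
# Function names and the abundance list are hypothetical, chosen for this example.


def random_placement_sar(a, A, abundances):
    """Expected number of species encountered in a sample of area a (equation (1))."""
    miss = 1.0 - a / A  # chance that a single individual falls outside the sample
    return len(abundances) - sum(miss ** n for n in abundances)


def random_placement_ear(a, A, abundances):
    """Expected number of species wholly confined to area a (equation (2))."""
    return sum((a / A) ** n for n in abundances)


def expected_first_and_last_area(A, n):
    """Expected areas needed to reach the first and the last of n individuals."""
    return A / (n + 1), n * A / (n + 1)


if __name__ == "__main__":
    A = 50.0                               # total area, e.g. a 50 ha plot
    abundances = [1, 2, 5, 40, 300, 2500]  # made-up N_i values
    S = len(abundances)
    for a in (0.5, 5.0, 25.0):
        backward = S - random_placement_sar(A - a, A, abundances)  # S - S_{A-a}
        forward = random_placement_ear(a, A, abundances)           # equation (2)
        print(f"a = {a:5.1f} ha  backward SAR = {backward:.6f}  forward EAR = {forward:.6f}")
    first, last = expected_first_and_last_area(A, 40)
    print(f"N = 40: a1 = {first:.3f} ha, aN = {last:.3f} ha, sum = {first + last:.3f} ha")
```

Under these assumptions the backward and forward values agree for every a, and the two expected areas sum to A. The agreement relies on random placement; the sketch says nothing about aggregated distributions.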




This mirror-image relationship only holds for randomly 
distributed species, however. Almost all species in nature are clumped, not 
randomly distributed. For aggregated species, one can show that 
$a^{1} + a^{N} < A$ with $a^{N} \geq a^{1}$ remaining true (Supplementary Information 
C and Supplementary Fig. 2). This leads to $S - S_{A-a} \neq S_{\mathrm{loss}}$: the loss 
inferred backwards from the SAR no longer equals the loss given by the EAR. The more 
spatially aggregated species distributions are, the stronger the inequality 
$a^{N} \geq a^{1}$ becomes. These results are completely general and explain the 
discrepancy between the backward SAR and forward EAR methods as 
well as why the backward SAR method systematically overestimates 
extinction rates. 

These results apply to sample areas on any spatial scale. We can 
assess the magnitude of overestimation by the backward SAR method 
precisely in cases where we know the species composition and spatial 
location of each individual of each species or spatial range of each 
species. To illustrate this, we use spatially explicit data from eight large 
stem-mapped plots from a global forest dynamics network. We also 
perform the analysis on biogeographical spatial scales for passerine 
species in the continental USA (see Methods). The results show that 
the classic power-law SAR model, $S = cA^{z}$, and its corresponding EAR 
model (Supplementary Information A), 

$$\lambda = S_{\mathrm{loss}}/S_A = 1 - (1 - a/A)^{z} \qquad (3)$$

are not mirror-image curves. In equation (3), $S_{\mathrm{loss}}$ is the number of 
species lost (endemic) to destroyed sub-area a. Because of the 
difference in sampling procedure of encountering species and losing species, 
the slopes z of the power-law model $S = cA^{z}$ and EAR (3) are not the 
same. The fits of the power-law SAR and EAR to species-area and 
endemics-area data, respectively, lead to two very different slopes 


[Figure 2 plots not reproduced. Surviving labels: panel 'Changbai'; x axis 'Area (ha)' for the forest plots and 'Area' for the passerine data.] 

Figure 2 | Species- and endemics-area curves for six of the nine data sets 
in Table 1. The second and fourth columns are the plots on a log-log scale. The 
upper and lower blue curves are the fits of the power-law SAR and EAR 
(equation (3)), respectively. The upper and lower red curves are the predictions 
of the random placement SAR (equation (1)) and EAR (equation (2)), 
respectively. Unlike for the other data sets, the red curve for US passerine data 
(cell size 0.48° latitude × 0.48° longitude) is the fit of equation (3) because the 
abundances of the passerine species are not known (so equation (2) cannot be 
used). The cloud of points represents 100 repeated random samples of the SAR 
and EAR. The SAR and EAR curves for the Barro Colorado Island plot are 
shown in Supplementary Fig. 1. 
Table 1 | Eight stem-mapped forest plots across the world and distributions of passerine birds in the continental USA 

| Plot | Forest type | Size (ha) | Number of trees | Number of species | $z_{\mathrm{SAR}}$ | $z_{\mathrm{EAR}}$ | Bias at 0.52% loss (%) | Bias at 25% loss (%) |
|---|---|---|---|---|---|---|---|---|
| Barro Colorado Island, Panama | Lowland tropical forest | 50 | 325,549 | 316 | 0.133 (0.00202) | 0.0803 (0.000611) | 65.61 | 64.38 |
| Yasuni, Ecuador | Lowland tropical forest | 50 | 307,279 | 1,128 | 0.126 (0.00473) | 0.0623 (0.00189) | 102.21 | 100.4 |
| Pasoh, Malaysia | Lowland tropical forest | 50 | 323,262 | 814 | 0.124 (0.00374) | 0.0536 (0.00158) | 131.30 | 129.02 |
| Korup, Cameroon | Lowland tropical forest | 50 | 328,973 | 496 | 0.179 (0.00369) | 0.113 (0.00116) | 58.38 | 56.92 |
| Dinghu, China | Subtropical evergreen broad-leaved forest | 20 | 71,617 | 210 | 0.274 (0.00180) | 0.193 (0.000880) | 41.94 | 40.34 |
| Fushan, Taiwan | Subtropical evergreen broad-leaved forest | 25 | 114,508 | 110 | 0.142 (0.00199) | 0.0922 (0.000838) | 53.99 | 52.92 |
| Tiantong, China | Subtropical evergreen broad-leaved forest | 20 | 94,603 | 152 | 0.200 (0.00214) | 0.0994 (0.00175) | 101.15 | 98.34 |
| Changbai, China | Temperate forest | 25 | 38,902 | 52 | 0.184 (0.00296) | 0.0905 (0.00233) | 103.27 | 100.62 |
| USA (0.24° × 0.24°) | Passerine birds | 14,904 cells | - | 279 | 0.187 (0.00101) | 0.0766 (0.000516) | 144.06 | 140.3 |
| USA (0.48° × 0.48°) | Passerine birds | 3,830 cells | - | 279 | 0.195 (0.00106) | 0.0791 (0.000421) | 147.39 | 143.39 |

The 'bias' is the overestimation calculated by comparing the extinction rates estimated from the $z_{\mathrm{SAR}}$ values with those from the endemic $z_{\mathrm{EAR}}$ values: $(\lambda_{\mathrm{SAR}} - \lambda_{\mathrm{EAR}})/\lambda_{\mathrm{EAR}}$. We calculated percentage bias by assuming 
0.52% and 25% habitat loss, respectively. Equation (3) gives $\lambda$. To analyse passerine distributions, we divided the lower 48 states of the USA into a grid of 14,904 cells with cell size of 
0.24° latitude × 0.24° longitude and into 3,830 cells with cell size of 0.48° latitude × 0.48° longitude. 


(the SAR $z_{\mathrm{SAR}}$ versus the EAR $z_{\mathrm{EAR}}$) (Table 1). In some cases, $z_{\mathrm{SAR}}$ can 
be more than double $z_{\mathrm{EAR}}$. This result is independent of the spatial 
scale of the data, as is evident for the passerine case shown in Table 1. 
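A short worked approximation, added here as an aid and not taken from the paper, shows why the bias reported in Table 1 tracks the ratio of the two slopes when the lost fraction a/A is small. It assumes equation (3) has the form $\lambda = 1 - (1 - a/A)^{z}$ and expands it to first order in a/A:

```latex
\lambda = 1 - \left(1 - \frac{a}{A}\right)^{z} \approx z\,\frac{a}{A}
\quad (a/A \ll 1),
\qquad\text{so}\qquad
\frac{\lambda_{\mathrm{SAR}} - \lambda_{\mathrm{EAR}}}{\lambda_{\mathrm{EAR}}}
\approx \frac{z_{\mathrm{SAR}}}{z_{\mathrm{EAR}}} - 1 .
```

For the Barro Colorado Island slopes in Table 1 (0.133 and 0.0803), this gives roughly 0.133/0.0803 - 1 ≈ 0.656, close to the 65.61% listed for 0.52% habitat loss.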

This analysis demonstrates that the most widely used method of 
estimating species extinction rates due to habitat loss, the backward 
SAR calculation, is not correct. For non-randomly distributed species, 
the SAR and EAR are not mirror images, so that one cannot be used to 
infer the other. This result holds regardless of how well the power-law 
SAR fits species-area data (Supplementary Information D). Even for 
randomly distributed species, the backward power-law SAR model is 
still not appropriate for estimating extinction rates because in this case 
equation (1) is the only correct SAR, not the power-law SAR (Sup- 
plementary Fig. 2), and equation (2) is the only correct EAR model. 
These results show that the concept of an ‘extinction debt’ (that is, the 
extinctions lost to biotic relaxation due to habitat destruction) based 
on the backward SAR model is not conceptually sound. Note that these 
results say nothing about whether an extinction debt exists, only that 
such a debt as might exist is not appropriately measured by the back- 
ward SAR method. To model the process of biotic relaxation will 
require a dynamic theoretical framework different from the current 
static SAR model. Currently, no such theory is available. The EAR 
curve is consistent with the concept of ‘imminent extinction’, which 
states that predictions of near-term extinctions due to habitat loss 
should focus on species endemic to the area of destroyed habitat. 

Previous estimates of extremely high extinction rates, for example 
one species per hour to one species a day, 33-50% of all species 
between the 1970s and 2000 (ref. 9), from half to several million species 
by 2000 (refs 10, 12) or 50% of species by 2000 (ref. 11), have not been 
observed. There is also reason to question the recent estimates of 
extinction rates made by the Millennium Ecosystem Assessment and those 
by Thomas et al. In the latter case, the loss of habitat and the shift of 
species' ranges are driven by climate change. However, the use of the 
flawed backward SAR in Thomas et al. raises a legitimate question 
about the validity of their conclusion that 18-35% of species will be 
committed to extinction by 2050. We suggest that their estimated rates 
of extinction should be regarded as a high-end possibility rather than as 
supported by hard scientific evidence. 

By how much have we overestimated extinction rates? Precise 
answers to this question require information about the EAR curve, 
which is generally not known. However, we can make a first approxi- 
mation from the results shown in Table 1, for which we know the EAR 
curves in stem-mapped samples of forests and range distributions of 
passerines. We calculated the $z_{\mathrm{EAR}}$ and $z_{\mathrm{SAR}}$ averaged over the data in 
Table 1, leading to $z_{\mathrm{EAR}} = 0.0940$ and $z_{\mathrm{SAR}} = 0.174$. We then used two 
estimates of forest habitat loss, the annual deforestation rate of 
(a/A) × 100% = 0.52% for humid tropical forests and the estimated 
25% conversion of original forest habitat into agricultural land. The 
SAR backward method (equation (3)) overestimates extinction rates 
by 85.07% and 83.00% in these respective cases, compared with the 
rates estimated by the forward EAR method. Conservation biologists 
often use a z value of 0.25 in cases where z values are not available. 
Using this value inflates extinction rate estimates much more, by 
165.85% and 160.10% for the two deforestation rates, respectively (see 
also Supplementary Fig. 1). 
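The overestimation figures quoted above can be reproduced from equation (3) and the bias definition given with Table 1. The Python sketch below is a minimal illustration rather than the authors' code: the function names are hypothetical, and it assumes equation (3) reads lambda = 1 - (1 - a/A)^z, with the same form used for the SAR slope and the EAR slope.

```python
# Illustrative sketch: overestimation of extinction by the backward SAR method.
# Function names are hypothetical; the slopes and loss fractions come from the text.


def fraction_lost(loss_fraction, z):
    """Fraction of species lost for a given fraction of area lost (equation (3))."""
    return 1.0 - (1.0 - loss_fraction) ** z


def percent_bias(loss_fraction, z_sar, z_ear):
    """Percentage by which the SAR-based estimate exceeds the EAR-based one."""
    lam_sar = fraction_lost(loss_fraction, z_sar)
    lam_ear = fraction_lost(loss_fraction, z_ear)
    return 100.0 * (lam_sar - lam_ear) / lam_ear


if __name__ == "__main__":
    z_ear = 0.0940                   # average EAR slope quoted in the text
    for z_sar in (0.174, 0.25):      # average SAR slope, and the common default
        for loss in (0.0052, 0.25):  # 0.52% annual loss and 25% conversion
            bias = percent_bias(loss, z_sar, z_ear)
            print(f"z_SAR = {z_sar}, loss = {loss:.2%}: bias = {bias:.2f}%")
```

Run with these inputs, the sketch returns values that round to the percentages quoted in the text, which supports the reconstructed form of equation (3).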

Are better methods available for estimating extinction rates? Our 
results show that the random placement EAR curve describes the 
empirical EAR curves for the forest plots very well. This result is 
remarkable and provides a simple method for estimating extinction 
(Supplementary Information E). Note that the theoretical random 
placement EAR for each plot is not data-fitting but a genuine predic- 
tion from equation (2). 

These results might receive a mixed reaction from the conservation 
community. On the one hand, the good news is that all extinction rate 
estimates based on the backward SAR method are overestimates. 
Because it is derived from sample areas of first contact with each 
species, the backward SAR method makes the previously unrecognized 
assumption that any loss whatsoever of population due to habitat loss 
commits a species to extinction, which clearly is not true. On the other 
hand, there is likely to be concern that these results could jeopardize 
conservation efforts and be falsely construed in some quarters to imply 
that habitat loss is not a problem. Nothing could be further from the 
truth. There is no doubt whatsoever that the Millennium Ecosystem 
Assessment has correctly identified habitat loss as the primary threat 
to conserving the Earth's biodiversity, and the sixth mass extinction 
might already be upon us or imminent. Our results do indicate, 
however, that the backward SAR is not the correct way to estimate 
the magnitude of the current extinction event. To help mitigate 
contemporary extinctions and strengthen the science behind conservation 
planning, we need far better geographical data on endemism and 
species' distributions to improve forecasts of extinction rates. 
Improving geographical databases on the distribution of biodiversity 
on Earth should be an urgent international priority. 


METHODS SUMMARY 


We analysed data from eight 20-50 ha (1 hectare (ha) = 10^4 m^2), stem-mapped 
plots of the Center for Tropical Forest Science global plot network to construct 
SAR and EAR curves (http://www.ctfs.si.edu/). These data sets are suitable because 
(1) our analysis is independent of spatial scale, (2) they are among the few data sets 
in which individuals are mapped on a landscape scale and (3) the EAR curve, 
which must be known, cannot be calculated from SAR curves (see text). 

We obtained the SAR and EAR curves as follows: (1) grid the plot into cells of some 
minimum size (for example 5 m × 5 m); (2) count the number of species and the 
number of endemic species (species completely confined to the sample area) in each 
cell; (3) average the number of species per cell and the number of endemic species 
across all cells of a given size; and (4) construct species-area and endemics-area 
curves by repeating steps 1-3, increasing cell size from 5 m × 5 m to 10 m × 5 m, 
10 m × 10 m, etc. up to the entire plot. 
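The gridding procedure in steps 1-3 can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the function name, the (x, y, species) tuple layout and the toy stem list are hypothetical, and the sketch assumes a rectangular plot whose sides are exact multiples of the cell size.

```python
# Illustrative sketch of steps 1-3: one point of the SAR and one of the EAR
# for a given cell size. All names and the toy data are hypothetical.
from collections import Counter, defaultdict


def sar_ear_point(stems, plot_w, plot_h, cell_w, cell_h):
    """Return (mean species per cell, mean endemic species per cell).

    stems is an iterable of (x, y, species) tuples with coordinates in metres.
    A species is endemic to a cell when all of its individuals lie in that cell.
    """
    nx = int(plot_w // cell_w)
    ny = int(plot_h // cell_h)
    totals = Counter(sp for _, _, sp in stems)  # plot-wide abundance per species
    per_cell = defaultdict(Counter)             # abundance per species per cell
    for x, y, sp in stems:
        ix = min(int(x // cell_w), nx - 1)      # keep stems on the far edge inside
        iy = min(int(y // cell_h), ny - 1)
        per_cell[(ix, iy)][sp] += 1
    n_cells = nx * ny                           # empty cells count as zero species
    richness = sum(len(counts) for counts in per_cell.values())
    endemics = sum(
        1
        for counts in per_cell.values()
        for sp, n in counts.items()
        if n == totals[sp]
    )
    return richness / n_cells, endemics / n_cells


if __name__ == "__main__":
    toy = [(1, 1, "a"), (2, 2, "a"), (1, 1, "b"), (8, 8, "b"), (8, 8, "c")]
    for size in (5, 10):  # step 4: repeat with increasing cell size
        print(size, sar_ear_point(toy, 10, 10, size, size))
```

Repeating the call over a sequence of cell sizes, as in step 4, yields the species-area and endemics-area curves. With real stem maps the averaging over all cells of a given size is what makes the curves independent of any single starting location.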

We estimated $z_{\mathrm{SAR}}$ by nonlinear fit of the power-law SAR model to the observed 
species-area data. We limited fitting to areas of at least 0.2 ha because the 
power-law model is not considered applicable at small spatial scales (including them 
inflates z values and worsens overestimation). We estimated $z_{\mathrm{EAR}}$ by directly 
fitting equation (3) to the observed endemics-area data (see Table 1). 
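The text does not say which nonlinear fitting routine was used. As one possible approach, the sketch below fits S = cA^z by least squares in the original (untransformed) scale: for each trial z the best c has a closed form, and z is then found by a golden-section search. The function name, the search bounds and the synthetic data are assumptions made for this illustration, and the search presumes the squared error has a single minimum in z between the bounds.

```python
# Illustrative sketch: least-squares fit of S = c * A**z using only the standard
# library. This is a stand-in technique; the paper does not specify its routine.


def fit_power_law_sar(areas, species, min_area=0.2, z_lo=0.0, z_hi=1.0, iters=80):
    """Return (c, z) minimising the sum of squared residuals of S = c * A**z.

    Points with area below min_area (in ha) are dropped, following the text.
    """
    pts = [(a, s) for a, s in zip(areas, species) if a >= min_area]

    def best_c(z):
        # For fixed z the model is linear in c, so c has a closed-form optimum.
        num = sum(s * a ** z for a, s in pts)
        den = sum(a ** (2 * z) for a, _ in pts)
        return num / den

    def sse(z):
        c = best_c(z)
        return sum((s - c * a ** z) ** 2 for a, s in pts)

    g = (5 ** 0.5 - 1) / 2  # golden-section ratio
    lo, hi = z_lo, z_hi
    for _ in range(iters):
        m1 = hi - g * (hi - lo)
        m2 = lo + g * (hi - lo)
        if sse(m1) < sse(m2):
            hi = m2  # minimum lies in [lo, m2]
        else:
            lo = m1  # minimum lies in [m1, hi]
    z = 0.5 * (lo + hi)
    return best_c(z), z


if __name__ == "__main__":
    areas = [0.05, 0.2, 0.5, 1.0, 2.0, 5.0, 10.0, 25.0, 50.0]
    species = [100.0 * a ** 0.15 for a in areas]  # synthetic, noise-free data
    c, z = fit_power_law_sar(areas, species)
    print(f"c = {c:.3f}, z = {z:.4f}")
```

The same pattern could be adapted to equation (3) by replacing the model function, although equation (3) has only the single parameter z and so needs no inner solve for c.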

We analysed SAR and EAR curves for 279 passerine species in the lower 48 
states of the USA using individual species' range maps from Natureserve 
(http://www.natureserve.org/getData/birdMaps.jsp). We divided the USA into grids at 
two respective cell sizes, 0.24° latitude × 0.24° longitude (14,904 cells) and 0.48° 
latitude × 0.48° longitude (3,830 cells), to confirm that our analysis is robust to 
scale change, as predicted by the analytical results. We computed SAR and EAR 
curves using presence-absence data following the above procedure. 
Neuropsin cleaves EphB2 in the amygdala to control 
anxiety 
A minority of individuals experiencing traumatic events develop 
anxiety disorders. The reason for the lack of correspondence 
between the prevalence of exposure to psychological trauma and 
the development of anxiety is unknown. Extracellular proteolysis 
contributes to fear-associated responses by facilitating neuronal 
plasticity at the neuron-matrix interface. Here we show in mice 
that the serine protease neuropsin is critical for stress-related 
plasticity in the amygdala by regulating the dynamics of the 
EphB2-NMDA-receptor interaction, the expression of Fkbp5 
and anxiety-like behaviour. Stress results in neuropsin-dependent 
cleavage of EphB2 in the amygdala causing dissociation of EphB2 
from the NR1 subunit of the NMDA receptor and promoting 
membrane turnover of EphB2 receptors. Dynamic EphB2-NR1 
interaction enhances NMDA receptor current, induces Fkbp5 gene 
expression and enhances behavioural signatures of anxiety. On 
stress, neuropsin-deficient mice do not show EphB2 cleavage and 
its dissociation from NR1 resulting in a static EphB2-NR1 inter- 
action, attenuated induction of the Fkbp5 gene and low anxiety. 
The behavioural response to stress can be restored by 
intra-amygdala injection of neuropsin into neuropsin-deficient mice 
and disrupted either by injecting anti-EphB2 antibodies or by 
silencing the Fkbp5 gene in the amygdala of wild-type mice. Our 
findings establish a novel neuronal pathway linking stress-induced 
proteolysis of EphB2 in the amygdala to anxiety. 

Fear helps organisms recognize, memorize and predict danger, 
thereby promoting their survival. However, severe stress can trigger 
maladaptive forms of neuronal remodelling leading to generalization 
of fear and high anxiety. 

Traumatic events are memorized as a result of the capacity of 
synaptic connections and the surrounding matrix to undergo 
experience-dependent functional or morphological changes. Extracellular 
proteases are strategically poised to remodel the neuron-extracellular-matrix 
interface and facilitate fear and anxiety. Eph-receptor tyrosine 
kinases constitute an important group of molecules subject to 
modulation by extracellular proteases. Although Ephs promote neuronal 
plasticity, their involvement in behavioural responses to 
environmental stimuli is not clear. 

Neuropsin is a serine protease uniquely positioned to facilitate 
stress-induced plasticity due to its high expression in the amygdala 
and hippocampus. To investigate whether neuropsin and Ephs co-localize 
we performed immunohistochemistry. Consistent with previous 
reports, we found robust expression of both neuropsin and 
EphB2 in the amygdala (Fig. 1 and Supplementary Fig. 1) and the 
hippocampus (not shown). Double immunohistochemistry revealed 
high levels of neuropsin co-localizing with EphB2-rich clusters on 
amygdala neurons (Fig. 1a). 

To assess whether Ephs are modulated by neuropsin we treated 
SH-SY5Y cells with neuropsin and measured the levels of Eph receptors by 
western blotting. We found that neuropsin (but not other proteases; 
Supplementary Fig. 2) cleaved EphB2 (decrease by 41%, P < 0.001), 
whereas the levels of other Ephs or their ligand, ephrinB2, remained 
unchanged (Fig. 2a, b and Supplementary Fig. 3a). When we expressed 
either GFP-tagged EphB2, GFP-tagged EphA4 or unlinked GFP in 
SH-SY5Y cells (Supplementary Figs 4, 5) and treated them with neuropsin 
we saw a similar decrease in the EphB2-associated signal 
(Supplementary Fig. 5; P < 0.05). 

When we used the above protocol to examine the composition of the 
SH-SY5Y or HEK293 cell culture medium after the application of neu- 
ropsin, we found a new ~70kDa extracellular fragment of EphB2 
released into the media (Fig. 2c), the size of which was consistent with 
neuropsin cleaving EphB2 close to the cell membrane (Supplementary 
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Figure 1 | Neuropsin and EphB2 co-localize in neurons of the basolateral 
complex of the amygdala. a, Double immunohistochemistry showed 
neuropsin (green) and EphB2 (red) co-localize in lateral amygdala neurons 
(arrows show EphB2-rich clusters at neuropsin detection sites). Cells were 
highlighted with TOTO-3 stain. b, Triple immunohistochemistry confirmed 
the presence of neuropsin/EphB2-rich clusters on the neuronal surface and a 
low degree of co-localization with cytoplasmic Fkbp51. c, d, Western blotting 
revealed amydala neuropsin upregulation after 6h restraint stress (Fi, 7) = 8.81; 
P<0.05). Digits inside columns indicate n. *P < 0.05. Results are shown as 
mean + s.e.m. 

Figure 2 | Neuropsin cleaves EphB2 and regulates its expression both in
vitro and in the amygdala after stress. a, b, EphB2-S (short splice variant)
band density in SH-SY5Y cells decreased on 15 min of neuropsin treatment
(F(3, 18) = 11.24; P < 0.001). Neuropsin did not cleave other molecules of the
same class (b and Supplementary Fig. 3a). c, Exposure of EphB2-GFP-
transfected SH-SY5Y or HEK293 cells to neuropsin (15 or 45 min) resulted in
the appearance of a ~70 kDa amino-terminal EphB2 fragment in the medium
(Supplementary Fig. 6). WB, western blot. d, e, A twofold increase in
membrane-associated EphB2 in neuropsin−/− (Fe, 12) = 6.4; P < 0.05 versus
non-stressed) but not wild-type mice was observed after stress (P < 0.05 versus
stressed neuropsin−/− mice). f, RT-qPCR revealed a twofold upregulation of
Ephb2 gene expression after 6 h stress (F(3, 23) = 13.48; P < 0.001), not observed
in neuropsin-deficient animals (P < 0.001 versus stressed wild-type mice).
EphB2-S and EphB2-L describe short and long splice variants, respectively. NP,
neuropsin. Digits inside columns indicate n. *P < 0.05, **P < 0.01,
***P < 0.001. Results are shown as mean ± s.e.m.


Fig. 6a, b). Next, we subjected wild-type and neuropsin−/− mice to
restraint stress to activate the basolateral complex of the amygdala’. 
Neuropsin levels increased by 50% after stress and gradually normalized 
during recovery in this brain region (Fig. 1c, d; P< 0.05). Western 
blotting revealed a twofold increase in membrane-associated amygdala 
EphB2 levels after 15 min of restraint stress in neuropsin−/− mice
(Fig. 2d, e and Supplementary Fig. 7; P< 0.05) indicative of new 
EphB2 receptors being incorporated into the membrane. This increase 
was not observed in wild-type mice, consistent with neuropsin- 
mediated EphB2 cleavage during stress. Cleavage of EphB2 in the 
amygdala of wild-type mice was followed by a twofold increase in the 
expression of the Ephb2 gene (Fig. 2f; P < 0.001). The cleavage was
substrate-specific because stress did not alter the levels of either 
ephrinB2 (Supplementary Fig. 9a) or a presynaptic neuropsin substrate, 
NCAM-L1 (also known as L1cam)"* (Supplementary Fig. 3b, c).

To determine the structural basis of neuropsin-specific EphB2 
cleavage we analysed the fibronectin type III domain of EphB2
(Supplementary Fig. 8) looking for similarities with the previously 
published neuropsin cleavage sequences’*. We found a critical amino 
acid pair, Gly-Arg, at position 517 of EphB2 but not EphB1, EphB6 
or EphA4. Consistent with our experimental findings cleavage of 
EphB2 at this site would result in the release of a ~70 kDa extracellular 
fragment. 

The EphB2 receptors cluster and associate with NMDA receptors at
excitatory synapses'*'’. Indeed, co-immunoprecipitation revealed
NR1 bound to EphB2 in the amygdala (Fig. 3a). Restraint stress
reduced the amount of EphB2 associated with NR1 at 15 min by
42% (Fig. 3a, b; P < 0.05) whereas NR1 levels were not altered
(Supplementary Fig. 9b-e). Stress-induced decrease in the EphB2-
NR1 association was not observed in neuropsin−/− mice but was restored
by intra-amygdala administration of neuropsin into these animals 
(Fig. 3a, b), consistent with neuropsin cleaving the extracellular por- 
tion of EphB2 during stress and triggering its dissociation from NR1. 
These results, together with stress-induced EphB2 membrane traffick- 
ing (Fig. 2d, e), indicate that neuropsin increases the dynamics of the 
EphB2-NR1 interaction after stress. 

Regulating the EphB2-NMDA-receptor interaction results in 
modulation of the expression of NMDA-receptor-dependent genes 
facilitating synaptic plasticity’. To examine if neuropsin-mediated 
regulation of the EphB2-NR1 assembly affects gene expression in 
the amygdala we used microarrays in neuropsin−/− and wild-type
mice. We found 19 differentially expressed transcripts with a marked 
upregulation of the Fkbp5 gene (Fig. 3c, d and Supplementary Figs 10, 
11; P< 0.0005). This gene encodes the Fkbp51 protein, which regu- 
lates glucocorticoid receptor sensitivity. Fkbp5 has been implicated in 
the development of anxiety, depression and post-traumatic stress dis- 
order (PTSD)'**°. Quantitative polymerase chain reaction with 
reverse transcription (RT-qPCR) confirmed an increase in Fkbp5 gene 
expression in the amygdalae of stress-naive neuropsin−/− animals
(Fig. 3e; P< 0.05). 

The extent of upregulation of Fkbp5 messenger RNA shortly after 
trauma correlates with the development of PTSD". If neuropsin regu- 
lates Fkbp5 gene expression then the magnitude of its stress-related 
regulation should be altered in neuropsin-deficient mice. When we 
analysed stress-induced Fkbp5 gene expression we found a 21-fold 
upregulation in wild-type amygdalae (Fig. 3e; P < 0.001) but an attenuated
upregulation in neuropsin−/− mice. The increase in Fkbp5
gene expression was accompanied by a twofold upregulation of 
Fkbp51 protein levels in wild-type mice but not neuropsin−/− mice
(Fig. 3f, g; P < 0.05). These results indicate that neuropsin is a key
regulator of the Fkbp5 gene and protein expression. 

Neuropsin is an extracellular protease and thus unlikely to alter the 
expression of the Fkbp5 gene directly. Although the Fkbp5 gene can be 
regulated by glucocorticoids (Supplementary Fig. 12), the above dif- 
ferences in Fkbp5 expression after stress cannot be attributed to corti- 
costerone levels (Supplementary Fig. 13). Interference with EphB2 
signalling has recently been linked to the regulation of the Fkbp5 
gene”. Indeed, when we mimicked stress in vitro by adding corticos- 
terone into neuronal amygdala cultures, neuropsin-mediated upregu- 
lation of Fkbp5 was hindered by anti-EphB2 antibody (Fig. 3h; 
P<0.001) and imitated by NMDA receptor stimulation (Fig. 3i; 
P<0.05). 

To address the effect of neuropsin on NMDA receptors directly we 
measured the evoked NMDA/AMPA current ratio in principal neurons
of the basal amygdala in wild-type and neuropsin−/− mice. We
found that, unlike in the hippocampus’’, the NMDA current was 
markedly reduced by the deletion of the neuropsin gene, resulting in 
a ~50% drop in the NMDA/AMPA ratio (Fig. 4a-c; P < 0.01).

We next asked if the neuropsin pathway affects neuronal plasticity 
in the amygdala. We induced early (E-LTP) or sustained (L-LTP) long- 
term potentiation in the amygdala lateral-basal pathway of wild-type 
and neuropsin−/− mice. Whereas basal synaptic responses were not
altered (Supplementary Fig. 14a), E-LTP was impaired in neuropsin−/−
mice (Fig. 4d-f and Supplementary Fig. 14b, c; P < 0.001 versus wild-type
at 20 min post-tetanus). These changes temporally correlated with
neuropsin-mediated cleavage of EphB2, its dynamic interaction with
NR1 and with the involvement of NMDA receptors in E-LTP in the
lateral-basal pathway (Supplementary Fig. 15). 

Figure 3 | Neuropsin regulates the dynamics of the EphB2-NR1 interaction
and controls the expression of Fkbp5. a, b, EphB2 immunoprecipitation (IP;
before and after 15 min of restraint stress) from amygdalae revealed
dissociation of EphB2-NR1 complexes in wild-type (F(3, 19) = 4.2; P < 0.05)
but not neuropsin−/− mice. EphB2-NR1 dissociation was restored in
neuropsin−/− mice by intra-amygdala neuropsin injections (F(3, 13) = 4.7;
P < 0.05). c, Microarray analysis of wild-type and neuropsin−/− amygdalae
revealed differential expression of Fkbp5 (heatmap in c, Supplementary Fig. 10).
d, e, Exon-specific Fkbp5 probes showed an upregulation of the whole
transcript (d) confirmed by RT-qPCR (e; Fis, 12) = 72.15; P < 0.001).
after stress) rescued by intra-amygdala neuropsin injections (F(3, 14) = 9.2;
P < 0.01). f, g, Fkbp51 protein levels were upregulated in wild-type mice
(Fe, 14) = 8.95; P < 0.001) but not in neuropsin−/− mice by stress.
h, i, Neuropsin-mediated upregulation of Fkbp5 in amygdala neuronal cultures
(Fos, 29) = 19.04; P < 0.0001) was blocked by anti-EphB2 antibody and mimicked
by stimulation of NMDA receptors (i; P < 0.05). CORT, corticosterone.

Figure 4 | Neuropsin controls NMDA receptor
current, E-LTP and stress-induced anxiety.
a-c, Whole-cell recordings from basal nucleus
neurons of neuropsin−/− mice demonstrated lower
NMDA currents compared to wild-type mice.
d-f, Induction of LTP in the lateral-basal pathway
(d) using a strong (e) or weak (f) protocol revealed
an impairment of E-LTP in neuropsin−/− mice.
BLA, basal amygdala; LA, lateral amygdala. Stimul.,
stimulation; Record., recording. g, The elevated-
plus maze test after acute or chronic restraint stress
demonstrated a lack of anxiety in neuropsin−/−
mice as indicated by the number of entries into
open arms. h, i, General locomotor activity was
similar in both genotypes. j, The behavioural
phenotype was reversed by bilaterally injecting
neuropsin back into the amygdala of neuropsin−/−
mice. k, l, Stress-induced anxiety in wild-type mice
was disrupted by blocking EphB2 (k) or silencing
the Fkbp5 gene (l) in the amygdala. Digits inside
columns or near symbols indicate n. 6hS, 6 h stress;
21dS, 21 days of daily restraint. *P < 0.05,
**P < 0.01, ***P < 0.001. Results are shown as
mean ± s.e.m.


To examine if the neuropsin pathway alters behavioural signatures 
of stress we subjected wild-type and neuropsin−/− mice to acute or
chronic stress and measured anxiety in the elevated-plus maze 
(Fig. 4g-i). We found that stress caused a decrease in the number of 
entries of wild-type mice into open arms, indicative of high anxiety 
levels’. In contrast, after stress, neuropsin−/− mice did not develop
anxiety (Fig. 4g; P< 0.05). Closed-arm entries, the total number of 
entries (Fig. 4h, i and Supplementary Fig. 16a, b) as well as general 
locomotor activity measures (Supplementary Fig. 17d) were similar 
between the genotypes as previously reported”’. Furthermore, neuropsin−/−
mice demonstrated an anxiolytic phenotype in the open field
test, confirming a general role of neuropsin in regulating anxiety 
(Supplementary Fig. 17a-c). Although this effect is consistent with 
functional deficits in NMDA receptor function (Fig. 4a-c) and 
E-LTP (Fig. 4d-f) observed in neuropsin−/− mice, it cannot be
excluded that additional mechanisms, such as abnormal dendritic 
plasticity, may contribute to the lack of anxiety observed in neuropsin−/−
mice, particularly after long-lasting stress”. 

To examine whether the effect of neuropsin was acute and not asso- 
ciated with the lack of the protease during development we 
bilaterally injected neuropsin into the amygdalae of neuropsin−/− mice
(Supplementary Fig. 18). The neuropsin injection restored stress- 
induced anxiety in these animals (Fig. 4j; P< 0.001). The development 
of anxiety was hindered by blocking EphB2 in the amygdala of wild- 
type mice (Fig. 4k; P< 0.001), consistent with neuropsin interacting 
with EphB2 to facilitate stress-induced behavioural changes. Similarly, 
stress-induced anxiety was blocked by silencing Fkbp5 gene expression
in this brain region (Fig. 4l and Supplementary Fig. 19), consistent
with a downstream role of Fkbp5 in the neuropsin pathway.

Our studies favour a model where, after stress, both corticosterone- 
induced and neuropsin-mediated components converge to modulate 
Fkbp5 gene expression and trigger anxiety (Supplementary Fig. 20). 
Neuropsin cleaves the extracellular portion of EphB2 and facilitates the 
dynamic interaction of EphB2 with the NR1 subunit of the NMDA 
receptor. The resulting enhancement of the NMDA current causes an 
upregulation of Fkbp5 and promotes the development of anxiety. This 
novel pathway, highlighting the ability of Eph and NMDA receptors to 
respond to activity-dependent signals from the extracellular milieu, 
opens new possibilities for the treatment of stress-associated disorders, 
including various forms of anxiety disorders. 


METHODS SUMMARY 


Restraint stress was performed by placing the mice in wire mesh restrainers while 
control mice were left undisturbed. Anxiety was measured using the elevated-plus 
maze by counting the number of entries to closed or open arms during 5 min. 
Intra-amygdala injections were performed through bilaterally implanted cannulae 
and were followed by restraint stress in plexiglass tubes. The Fkbp5 gene was 
silenced by intra-amygdala injection of lentiviral short hairpin RNA construct 
followed by behavioural assessment two weeks later. LTP was recorded from the 
lateral-basal pathway and whole-cell recordings made from basal amygdala neu- 
rons. Data were analysed by Student’s t-test or ANOVA followed by Tukey’s post- 
test. P values of less than 0.05 were considered significant. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS


Mice. Experiments were performed on three-month-old wild-type (C57BL/6J) or
neuropsin−/− mice backcrossed to C57BL/6J for 12 generations. To generate
neuropsin−/− mice, exons 1-3 of the neuropsin gene, including the protease active
site, were replaced by a neomycin resistance cassette”. A lack of full-length
neuropsin transcript and proteolytic activity in the brain of these mice was confirmed
by RT-PCR and amidolytic assay”*, respectively. Neuropsin−/− mice were
genotyped as described’’. Mice were housed three to five per cage in a colony room
with a 12 h light/dark cycle (lights on at 07:00) with ad libitum access to commercial
chow and tap water. The experiments were approved by the UK Home Office and 
the University of Leicester Ethical Committee. 

Restraint stress. C57BL/6J and neuropsin−/− mice were kept undisturbed for at
least one week in their home cages and restraint stress was performed during the
light period of the circadian cycle as described’. Control animals were left undisturbed,
and stressed animals were subjected to a single 5 min, 15 min or 6 h
restraint stress in a separate room. The mice were placed in their home cages in 
wire mesh restrainers secured at the head and tail ends with clips. 

Primers. The primers for Ephb2 (forward 5’-CTTCCTCATCGCTGTGGTC and 
reverse 5’-ATGTGTCCGCTGGTGTAGTG) and Fkbp5 (forward 5’-ATTTGATTGCCGAGATGTG
and reverse 5’-TCTTCACCAGGGCTTTGTC) were bought
from Invitrogen. To quantify gene expression the target genes were compared 
against the actin gene as previously described”*. 
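
The text states only that target genes were compared against actin "as previously described", so the exact calculation is not given here. A minimal sketch of one common approach, the 2^-ΔΔCt method, is shown below; the choice of this method, the assumption of 100% amplification efficiency, the function names and the Ct values are all illustrative assumptions, not details taken from the study.

```python
def mean(values):
    """Arithmetic mean of replicate Ct values (e.g. triplicate wells)."""
    return sum(values) / len(values)


def fold_change(ct_target_treated, ct_actin_treated,
                ct_target_control, ct_actin_control):
    """Relative expression by the 2^-ddCt method (assumes 100% PCR efficiency)."""
    # dCt: target Ct minus reference (actin) Ct within each condition.
    d_ct_treated = mean(ct_target_treated) - mean(ct_actin_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_actin_control)
    # ddCt: treated dCt minus control dCt; one cycle corresponds to a twofold change.
    return 2.0 ** -(d_ct_treated - d_ct_control)


# Hypothetical triplicate Ct values, for illustration only (not study data).
print(fold_change([24.0, 24.0, 24.0], [18.0, 18.0, 18.0],
                  [25.0, 25.0, 25.0], [18.0, 18.0, 18.0]))
```

With these made-up values the target crosses threshold one cycle earlier in the treated sample at equal actin Ct, which corresponds to a twofold upregulation.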

RNA extraction. Control and stressed mice were anaesthetized (intraperitoneal 
sodium pentobarbital 50 mg kg−1) and perfused transcardially (ice-cold PBS).
Amygdalae were dissected from a coronal slice −0.58 to −2.3 mm relative to
Bregma and stored in RNAlater (Qiagen) at 4 °C. Alternatively, RNA was
extracted from primary amygdala neuronal cultures using QIAzol lysis reagent 
(QIAgen) and Mini Spin Columns according to the manufacturers’ instructions 
(RNeasy Lipid tissue mini kit, QIAgen). Two micrograms of RNA was converted 
to cDNA using Superscript III (Invitrogen) and oligo(dT) primers according to 
manufacturer’s instructions. 

RT-qPCR reaction. Triplicate wells contained 20 µl of SYBR Green Master Mix
(Applied Biosystems and BioRad), forward primer (250 nM), reverse primer
(250 nM), cDNA (1 µl) and nuclease-free water to a total of 40 µl. PCR was
performed using Chromo4/PTC-200 thermal cycler (MJ Research) under the 
following conditions: (1) 95°C for 15 min; (2) 94°C for 15s; (3) 55°C for 30s; 
(4) 72°C for 30s; (5) steps 2-4 repeated 40 times. Control reactions were per- 
formed without DNA template and/or with unconverted RNA as the template. 

Western blotting, cell fractionation and immunoprecipitation. Mice were
anaesthetized (intraperitoneal sodium pentobarbital 50 mg kg−1) and perfused
transcardially (ice-cold PBS). The brains were removed and amygdalae dissected
from a slice −0.58 to −2.3 mm relative to Bregma. Samples were homogenized
in 0.1M Tris, 0.1% Triton X-100, pH 7.4, containing phosphatase inhibitors 
(10mM NaF, 1mM Na orthovanadate) and protease inhibitors (Complete, 
Roche). Protein concentration was adjusted (Bradford method; Pierce). For 
Fkbp51 levels samples were homogenized in RIPA buffer (150 mM NaCl, 1% NP- 
40, 0.5% sodium deoxycholate, 0.1% SDS, 50mM Tris, 10mM NaF, 1mM Na 
orthovanadate, complete Roche protease inhibitors, pH 8.0). Reduced (DTT) and 
denatured (100 °C for 5 min) samples (40 µg per lane) were subjected to SDS-PAGE
electrophoresis and transferred onto nitrocellulose membrane, blocked (5% skim
milk for 1 h at room temperature (25 °C)) and washed with TBS-T (3 × 5 min). The
membranes were probed with the following primary antibodies overnight at 4 °C: 
goat anti-NCAM L1 (Santa Cruz Biotechnology; 1:300), goat anti-EphB2, anti-
EphB6 and anti-ephrinB2 (R&D; 1:500, 1:500 and 1:300, respectively), mouse 
anti-EphA4 (Zymed; 1:1,000), rabbit anti-pan-cadherin (Abcam; 1:2,000), rabbit 
anti-p75NGEF (Chemicon, 1:1,000), rabbit anti-NR1 (Upstate, 1:250), rabbit anti- 
neuropsin (H. C. Castro), rabbit anti-Fkbp51 (Abcam; 1:250). The membranes were 
then washed in TBS-T (3 × 10 min) and incubated with a relevant HRP-conjugated
secondary antibody as appropriate (Vector Labs, 1:1,000, 1 h, room temperature).
The signal was developed, after washing with TBS-T (6 × 10 min), using a Western
Blot Luminol Reagent (Santa Cruz). To normalize the results the membranes were
stripped, blocked, washed as above and re-blotted using mouse anti-β-actin antibody
(Sigma, 1:2,500, 1h, room temperature). The membranes were then prepared and 
developed as above. To quantify the results the band intensities were measured using 
Scion Image software and normalized to the actin bands. 
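
The normalization step described above can be sketched as follows. This is only an illustration of dividing each target band intensity by its own lane's actin band and then expressing the result relative to control lanes; the function names and the example intensities are hypothetical and do not come from the study or from Scion Image.

```python
def normalize_to_actin(target_intensities, actin_intensities):
    """Divide each target band intensity by the actin band of the same lane."""
    if len(target_intensities) != len(actin_intensities):
        raise ValueError("one actin value is needed per lane")
    return [t / a for t, a in zip(target_intensities, actin_intensities)]


def fold_over_control(normalized, control_lanes):
    """Express normalized values relative to the mean of the control lanes."""
    baseline = sum(normalized[i] for i in control_lanes) / len(control_lanes)
    return [value / baseline for value in normalized]


# Hypothetical densitometry readings for two lanes (control, stressed).
ratios = normalize_to_actin([200.0, 400.0], [100.0, 100.0])
print(fold_over_control(ratios, control_lanes=[0]))
```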

When indicated, cellular fractions of the amygdala samples were separated 
using a cellular protein fractionation kit (PerkinElmer) as per the manufacturer's 
protocol, and analysed by western blotting. 

For immunoprecipitation, amygdala samples were homogenized as previously 
described”, pre-cleared using goat IgG (Sigma, 1 µg) before incubation with goat
anti-EphB2 antibody (R&D, 2 µg) for 1 h (4 °C). The samples were then incubated
with protein G-sepharose beads overnight before being washed with PBS and 
analysed by western blotting. 


Cell culture. SH-SY5Y cells at 80-90% confluence were washed with PBS three 
times before incubation with PBS, PBS and neuropsin (50 nM; R&D) or PBS and
tissue plasminogen activator (tPA) (Alteplase, Genentech; 1 µg ml−1) with or
without human plasminogen for 15 min, after which the dishes were placed on 
ice and protease inhibitors (Complete, Roche) were added. The cells were collected 
and homogenized in Tris 50mM pH 7.5, NaCl 150mM, EDTA 5mM, EGTA 
5 mM, Triton-100 1%, NP40 0.5%. The resulting protein sample was analysed by 
western blotting as described above. 

SH-SY5Y and HEK293 cells were transfected with mouse EphB2-GFP and 
incubated with PBS or PBS + neuropsin (300 nM) for 15 or 45 min. The samples 
were then treated as above for analysis. 

For imaging, SH-SY5Y cells were transfected with GFP, mouse EphB2-GFP or 
EphA4-GFP (gift from A. Kania) and loaded with the cell tracker (Invitrogen). 
Images were taken with Zeiss LSM5 Exciter before and after 15 min incubation 
with neuropsin (50nM, R&D), converted to greyscale and the intensity of the 
fluorescent signal was analysed using Scion Image. 

Primary neuronal cultures were prepared from P1 C57BL/6J wild-type mice. 

The amygdalae were dissected and placed into a Petri dish (9.1 mM glucose,
25 mM HEPES, 5 mM KCl and 120 mM NaCl). Tissue was chopped and incubated
in 10 ml of buffer containing 5 mg of protease (Type XIV; Sigma) and 5 mg of
thermolysin (Type X; Sigma) at room temperature for 30-45 min. The digestion
solution was replaced with 3 ml of HBSS (Gibco) plus 40 µg ml−1 DNase. The
mixture was triturated, centrifuged and resuspended in the plating medium
(Neurobasal A medium, 10% fetal bovine serum, 2 mM Glutamax and 2% B-27
supplement, 100 µg ml−1 streptomycin, 100 U ml−1 penicillin (Invitrogen)),
centrifuged and resuspended again. Droplets of 20-30 µl containing cells were added
to the centre of poly-D-lysine (Sigma) coated coverslips and the plating medium was
added one hour later. Cytosine β-D-arabinofuranoside (Ara-C; 5 µM; Sigma) was
added to prevent proliferation of glial cells. Neurons were maintained in serum-free
Neurobasal A medium at 37 °C in a humidified atmosphere of 5% CO2/95% air.
Half of the medium was replaced every 3-4 days. Cells were maintained for 11-16
days in vitro and then treated with either vehicle, corticosterone (10 nM), neuropsin
(50 nM) or NMDA (100 µM, Sigma) + glycine (10 µM, Sigma). To block EphB2,
neurons were treated with anti-EphB2 antibody (2 µg ml−1; R&D) 10 min before
the experiment. 

Electrophysiology. For field recordings coronal slices of the amygdala (400 µm)
were obtained from 8-12-week-old neuropsin−/− and wild-type mice. The animals
were anaesthetized with ketamine/xylazine (2:1 ratio; 2.4 µl g−1 i.p.). Slices were
prepared using a vibrating microtome (Campden Instruments; MA752) in ice-cold,
low-sodium ACSF (sucrose 249 mM, KCl 2.5 mM, NaH2PO4 1.25 mM, D-glucose
10 mM, NaHCO3 26 mM, CaCl2 0.1 mM, MgSO4 2.9 mM, ascorbic acid 0.5 mM,
bubbled with 95% O2/5% CO2 mixture, pH 7.3). Slices were placed in a holding
chamber for 30 min at 35 °C and then for at least 2.5 h (30 min for whole-cell
recordings) at room temperature (25 °C) in ACSF (NaCl 124 mM, KCl 5 mM,
NaH2PO4 1.25 mM, D-glucose 10 mM, NaHCO3 26 mM, CaCl2 2.4 mM, MgSO4
1.3 mM). All the experiments were performed at room temperature. 

Extracellular recordings were made with a bipolar tungsten electrode (WPI). For
recordings, glass microelectrodes (1-2 MΩ) filled with ACSF were used. To record
field potentials in the lateral-basal amygdala pathway the stimulating electrode 
was positioned in the lateral amygdaloid nucleus close to the external capsule and 
the recording electrode in the basal nucleus**. The stimulus intensity was adjusted 
to evoke a field potential (FP) of 60-70% (0.2 ms pulse duration) of the maximal 
amplitude. The amygdala was stimulated every 30s in order to record a stable 
baseline for at least 15 min. Several responses were averaged and a template was 
created. Only the responses matching the template were analysed. E-LTP was 
evoked by a single tetanic stimulation (100 Hz, 1 s). L-LTP was elicited by two
tetanic stimulations (100 Hz, 1 s, 10 s interval) repeated 4 times at 3 min intervals
with the same intensity and pulse duration as the test stimuli as described”. The 
recordings were amplified (Multiclamp 700b, Axon Instruments), filtered 
(10 kHz) and digitized at 50 kHz (Digidata 1440A, Axon Instruments). pClamp 
10 (Axon Instruments) and Origin 7 (Microcal) software were routinely used 
during data acquisition and analysis. 
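
The two tetanization protocols can be written out as pulse schedules. The sketch below is illustrative only: it assumes that the 10 s and 3 min intervals are measured from train onset to train onset, which the text does not specify, and the function names are hypothetical.

```python
def tetanus(start_s, freq_hz=100, duration_s=1.0):
    """Pulse times (in seconds) for one tetanic train starting at start_s."""
    n_pulses = int(round(freq_hz * duration_s))
    return [start_s + i / freq_hz for i in range(n_pulses)]


def e_ltp_protocol():
    """E-LTP: a single 100 Hz, 1 s train."""
    return tetanus(0.0)


def l_ltp_protocol(pair_interval_s=10.0, block_interval_s=180.0, n_blocks=4):
    """L-LTP: pairs of 100 Hz, 1 s trains, repeated 4 times at 3 min intervals.

    Intervals are taken as onset-to-onset, which is an assumption.
    """
    pulses = []
    for block in range(n_blocks):
        block_start = block * block_interval_s
        pulses.extend(tetanus(block_start))
        pulses.extend(tetanus(block_start + pair_interval_s))
    return pulses


print(len(e_ltp_protocol()), len(l_ltp_protocol()))
```

Under these assumptions the E-LTP protocol delivers 100 pulses and the L-LTP protocol 800 pulses (8 trains of 100).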

For whole-cell recordings, coronal slices (300 µm) were obtained from 3-4-
week-old mice. The animals were anaesthetized with hypnorm/midazolam (1:1;
8 µl g−1 body weight, intraperitoneally). Recordings were made from somata of
principal neurons of the basal nucleus of the amygdala. Principal neurons and 
interneurons were distinguished by their morphological and electrophysiological 
properties*’. After whole-cell configuration the series resistance was regularly 
monitored and a maximum of 10-15 MΩ tolerated. AMPA and NMDA currents
were recorded by clamping the membrane potential of the cell at −70 mV and
+40 mV respectively (average of 5 traces each). The slice was subsequently perfused
with AP-5 in order to isolate the NMDA component at +40 mV (subtraction
of traces before and after perfusion with AP-5). The NMDA/AMPA ratio was
obtained by measuring the peak of the AMPA current at −70 mV in the presence
of AP-5 and the peak of NMDA current at +40 mV. At the end of the experiments
the currents were blocked by CNQX (30 µM in DMSO) and AP-5 (50 µM in
DMSO). All drugs were bath applied (perfusion rate 1.5 ml min−1). The recording
electrodes were borosilicate glass pipettes (2-4 MΩ). The pipettes were filled with
the following solution: Cs-methyl sulphonate 130 mM, KCl 8 mM, EGTA 0.5 mM,
HEPES 10 mM, glucose 5 mM, QX314 5 mM. ACSF composition: NaCl 124 mM,
KCl 5 mM, NaH2PO4 1.25 mM, D-glucose 10 mM, NaHCO3 26 mM, CaCl2 2 mM,
MgSO4 1 mM. All the experiments were performed at 25 °C. Data were recorded
with a Multiclamp 700B amplifier, filtered at 10 kHz and digitized at 50 kHz
(Digidata 1440A, Axon Instruments). pClamp 10 (Axon Instruments) and
Origin 7 (Microcal) software were routinely used during data acquisition and
analysis. 
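
The NMDA/AMPA ratio calculation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code used in the study (which relied on pClamp and Origin): traces are assumed to be baseline-subtracted lists of current samples of equal length, and all function names and values are hypothetical.

```python
def average_traces(traces):
    """Point-by-point average of equally long traces (e.g. 5 sweeps)."""
    n = len(traces)
    return [sum(samples) / n for samples in zip(*traces)]


def subtract(trace_a, trace_b):
    """Point-by-point difference trace_a - trace_b."""
    return [a - b for a, b in zip(trace_a, trace_b)]


def peak_amplitude(trace):
    """Largest absolute deflection of a baseline-subtracted trace."""
    return max(abs(value) for value in trace)


def nmda_ampa_ratio(ampa_minus70, plus40_before_ap5, plus40_after_ap5):
    """Peak NMDA current (+40 mV, AP-5-sensitive) over peak AMPA current (-70 mV)."""
    ampa = average_traces(ampa_minus70)
    # The NMDA component is what AP-5 removes from the +40 mV response.
    nmda = subtract(average_traces(plus40_before_ap5),
                    average_traces(plus40_after_ap5))
    return peak_amplitude(nmda) / peak_amplitude(ampa)


# Hypothetical three-sample traces in pA, one sweep each, for illustration only.
print(nmda_ampa_ratio([[0.0, -100.0, -20.0]],
                      [[0.0, 60.0, 50.0]],
                      [[0.0, 20.0, 5.0]]))
```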

Analysis of the Fkbp5 promoter. The identification of over-represented tran- 
scription factor binding sites (TFBSs) in the promoter region of Fkbp5 was per- 
formed using the cREMaG database (http://cremag.org). The promoter region of 
Fkbp5 was defined as an evolutionarily conserved (on the basis of the alignment of 
mouse and human genes) sequence between 10,000 bp upstream and 5,000 bp 
downstream of the transcription start site (TSS). The parameters of 65% conser- 
vation threshold and maximum number of top 10,000 conserved TFBSs in coding 
and non-coding regions were used. The obtained results were compared to con- 
served promoter background. Over-representation was measured as the number 
of identified TFBSs compared to putative number obtained by chance (P < 0.01). 

Microarray study. Amygdalae were isolated from wild-type (n = 15) and
neuropsin−/− (n = 15) mice using a dissecting microscope in ice-cold ACSF (glucose
25 mM, NaCl 115 mM, NaH2PO4·H2O 1.2 mM, KCl 3.3 mM, CaCl2 2 mM,
MgSO4 1 mM, NaHCO3 25.5 mM, pH 7.4) and stored at −20 °C in RNAlater
solution (Qiagen). RNA was extracted using RNeasy Lipid Tissue Mini Kit 
(Qiagen), the ribosomal fraction of RNA reduced with RiboMinus Kit 
(Invitrogen) and the RNA integrity verified by electrophoresis using Agilent 
Bioanalyser 2100 (Agilent Technologies). RNA pooled from three mice was
reverse-transcribed and hybridized with GeneChip Mouse Exon 1.0 ST Array 
(Affymetrix; 5 arrays per genotype). 

Microarray data were initially processed using GeneChip Operating Software. 

DTT data were transferred by Transfer Tool software (Affymetrix). Chip quality 
and raw microarray data pre-processing were performed according to the 
Affymetrix guidelines using Expression Console software (Affymetrix). After back- 
ground subtraction, the data were processed using the RMA method and quantile 
normalization. The obtained results were taken as the measure of mRNA abundance 
derived from the level of gene expression. Significance levels (P values) of differences 
in mRNA abundance between the wild-type and neuropsin−/− animals were calculated
for each probe set using the Student’s t-test. The P values for all exons of each
particular gene were multiplied to establish gene P value. The threshold of P < 0.05 
for each gene was computed using a permutation test followed by Bonferroni 
correction for multiple testing. All the statistical analyses were done in R software 
version 2.8.1 (http://www.r-project.org). Sources of variation were analysed by a 
three-way ANOVA using Partek Genomic Suite. 
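
Two of the steps named above, quantile normalization and the multiplication of exon-level P values into a gene-level value, can be illustrated with a short sketch. It is not the pipeline used in the study (Expression Console and R): ties are handled naively, the permutation-derived threshold is not reproduced, and all names and numbers are hypothetical.

```python
import math


def quantile_normalize(arrays):
    """Give every array the same distribution: the mean of the sorted arrays.

    arrays: list of equal-length lists, one list of intensities per microarray.
    Ties are ranked in order of appearance rather than averaged (a simplification).
    """
    n_probes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[rank] for s in sorted_arrays) / len(arrays)
                 for rank in range(n_probes)]
    normalized = []
    for a in arrays:
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, probe_index in enumerate(order):
            out[probe_index] = reference[rank]
        normalized.append(out)
    return normalized


def gene_p_value(exon_p_values):
    """Product of exon-level P values for one gene, as described in the text."""
    return math.prod(exon_p_values)


# Hypothetical intensities for two arrays with three probes each.
print(quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]))
```

Because a product of P values is not itself uniformly distributed under the null hypothesis, its significance threshold has to be calibrated separately, which is consistent with the permutation test mentioned in the text.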

Immunohistochemistry. Mice were anaesthetized (intraperitoneal sodium
pentobarbital 50 mg kg−1) and transcardially perfused (ice-cold PBS containing
protease inhibitors (Complete, Roche) followed by ice-cold 4% paraformaldehyde). 
The brains were dissected and fixed in 4% paraformaldehyde in PBS overnight at 
4 °C. Seventy-micrometre-thick coronal slices were collected on a vibrating micro- 
tome and stored at 4 °C in PBS containing 0.002% sodium azide (Sigma). Slices were 
preincubated in PBS-T (PBS solution 0.5% bovine serum albumin, 0.02% Triton 
X-100 and blocking sera at 1:500) for 5 h at room temperature, incubated with goat 
anti-EphB2 (1:300, R&D) or rabbit anti-neuropsin (1:200, H. Castro; the antibody 
was preabsorbed on acetone powder prepared from neuropsin−/− brain for 1 h at
room temperature before use) antibodies, along with mouse anti-NeuN (1:200, 
Chemicon) and chicken anti-GFAP (1:1,000, Dako) overnight at 4°C in PBS-T. 
Next, the slices were washed for 8-10 h with PBS-T and incubated overnight with 
compatible FITC, Alexa Fluor 488, Alexa Fluor 546 or Alexa Fluor 647 secondary 
antibodies (1:500, Molecular Probes) in the same buffer. Control sections were 
processed with the primary antibodies omitted. For double Ephb2/neuropsin co- 
labelling sections were incubated in PBS-T containing anti-goat Alexa Fluor 546 as 
well as anti-rabbit Alexa Fluor 488 for detection of the above primary antibodies. 
TOTO-3 iodide (1nM, Molecular Probes) was added to the secondary antibody 
mixture. For the triple Ephb2/neuropsin/Fkbp5 labelling rat anti-Fkbp51 (R&D, 
1:500) was additionally used along with a compatible Alexa Fluor 647 secondary 
antibody. Sections were then washed in PBS-T for 5 h, mounted on glass slides using 
Vectamount medium (Vector Laboratories), and photographed using Zeiss LSM 5 
Exciter confocal microscope. 

Corticosterone levels. Mice were subjected to restraint stress of various durations
with or without recovery and trunk blood was collected to measure corticosterone 
levels in the plasma by EIA according to the manufacturer’s instructions (Cayman 
Chemicals, Cat No. 500651). 

Elevated-plus maze. The elevated-plus maze test was performed as previously 
described*. The apparatus consisted of four non-transparent white Plexiglas arms: 
two enclosed arms (50 × 10 × 30 cm) that formed a cross shape with the two open
arms (50 × 10 cm) opposite each other. The maze was 55 cm above the floor and
dimly illuminated. Wild-type and neuropsin−/− mice were tested 12 h after the
restraint stress. Mice were placed individually on the central platform, facing an 
open arm, and allowed to explore the apparatus for 5 min. Behaviour was recorded 
by an overhead camera. The number of entries of the animal from the central 
platform to closed or open arms was counted. The maze was cleaned with 70% 
alcohol after each session to avoid any odorant cues. 

Open field. Mice were placed in a 50 × 50 × 50 cm plexiglas box and were left free to
move for 10 min. The box was cleaned with 70% alcohol after each session to
avoid any odorant cues. An overhead camera placed above the box recorded the 
session. Locomotor parameters were analysed with the ANY-MAZE software 
(Stoelting). 

Stereotaxic injections. Mice were intraperitoneally anaesthetized with ketamine/ 
xylazine (100 and 10 mg kg−1, respectively), placed in a stereotaxic apparatus and
bilaterally implanted with stainless steel guide cannulae (26 gauge; Plastics One, 
Roanoke, VA) aimed above the basolateral complex of the amygdala (1.5 mm 
posterior to Bregma, 3.5 lateral and 4.0 ventral). The cannulae were secured in 
place with dental cement. Dummy cannulae were inserted into all implanted 
cannulae to maintain patency. After one week dummy cannulae were replaced 
with the injection cannulae (projecting 0.75 mm from the tip of the guide cannulae 
to reach the basolateral complex of the amygdala) and the mice were injected with 
either anti-EphB2 antibody (R&D, 1 µl, 2 µg ml−1), control IgG, recombinant
neuropsin (R&D, 1 µl, 50 nM) or its vehicle followed by 6 h restraint stress in
transparent plexiglass tubes. After the experiment a small volume of bromophenol 
blue was injected to visualize the injection site, the brains were sectioned and the 
cannulae placement was determined histologically. 

Fkbp5 gene silencing. To silence the Fkbp5 gene in the amygdalae we used 
SMARTvector 2.0 lentiviral shRNA technology (Dharmacon) using a human 
cytomegalovirus (hCMV) promoter and a turboGFP reporter gene. Three different
targeting constructs were tested. First, 0.3 µl of the lentivirus was injected at a
point 1.7 mm posterior to Bregma, 3.5 mm lateral from the midline and 4.4 mm
ventral at 200 nl min−1 using the Nanofil syringe with a 33-gauge needle through
an UMP-3.1 micropump (all from World Precision Instruments) mounted on 
Stoelting stereotaxic frame. After 5 min the needle was lowered to 5 mm ventral 
and 0.3 µl of the virus injected. The needle remained in place for another 5 min to
prevent the backflow, slowly removed and the skin closed with Vetbond (3M). 
After two weeks of recovery the amygdalae were dissected to determine the knockdown
efficiencies in vivo as compared to the non-targeting construct
(TGGTTTACATGTTGTGTGA; 2.66 × 10° TU ml−1) or uninjected mouse
amygdalae by RT-qPCR and western blotting as described above. The most efficient
construct (~60% mRNA and protein knockdown efficiency; targeting
sequence ATGCTGAGCTTATGTACGA; 3.02 × 10° TU ml−1) was used to
silence the Fkbp5 gene in all subsequent behavioural experiments. The restraint 
stress and the elevated-plus maze were performed two weeks after the intra- 
amygdala lentivirus injection as described above. The region specificity of lenti- 
viral injections was verified histologically by direct observation of the turboGFP 
fluorescence on consecutive coronal sections spanning the amygdala, using a Zeiss 
LSMS5 Exciter confocal microscope. 

Statistics. Student’s t-test (when two groups were compared) or analysis of vari- 
ance (ANOVA) followed by Tukey’s post-test were used as appropriate. P values of 
less than 0.05 were considered significant. The ANOVA P values are reported in
the text; the results of the post-test are indicated by asterisks on graphs.
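
As a rough illustration of the two-group comparison described above, the following Python sketch computes Student's t statistic with a pooled variance. The function name and the example values are hypothetical and are not taken from the study. Obtaining a P value would additionally require the t distribution (for example from a statistics package), which is omitted here.

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups,
    assuming equal variances (classic Student's t-test)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance, weighted by each group's degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical measurements for two groups (illustrative only).
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = student_t(control, treated)
```

The statistic would then be compared against the t distribution with `dof` degrees of freedom at the 0.05 threshold stated above.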

Stem-cell-triggered immunity through CLV3p-FLS2 signalling


Horim Lee¹, Ok-Kyong Chah¹ & Jen Sheen¹


Stem cells in the shoot apical meristem (SAM) of plants are the
self-renewable reservoir for leaf, stem and flower organogenesis. In
nature, disease-free plants can be regenerated from SAM despite
infections elsewhere, which underlies a horticultural practice for
decades. However, the molecular basis of the SAM immunity
remains unclear. Here we show that the CLAVATA3 peptide
(CLV3p), expressed and secreted from stem cells and functioning
as a key regulator of stem-cell homeostasis in the SAM of
Arabidopsis, can trigger immune signalling and pathogen
resistance via the flagellin receptor kinase FLS2 (refs 5, 6).
CLV3p-FLS2 signalling acts independently from the stem-cell signalling
pathway mediated through CLV1 and CLV2 receptors, and is
uncoupled from FLS2-mediated growth suppression. Endogenous
CLV3p perception in the SAM by a pattern recognition receptor for
bacterial flagellin, FLS2, breaks the previously defined self and
non-self discrimination in innate immunity. The dual perception of
CLV3p illustrates co-evolution of plant peptide and receptor kinase
signalling for both development and immunity. The enhanced
immunity in SAM or germ lines may represent a common strategy
towards immortal fate in plants and animals.

In both plants and animals, innate immunity is triggered through
pattern recognition receptors (PRRs) in response to microbe-associated
molecular patterns (MAMPs) to provide the first line of inducible
defence. Plant receptor kinases represent the main functions of known
plasma membrane PRRs for MAMP recognition to distinguish non-self
from self. FLS2 is the first characterized plant leucine-rich-repeat
(LRR) receptor kinase that perceives bacterial flagellin and launches
convergent downstream signalling and defence pathways for
potentially broad-spectrum pathogen resistance. The perception of
bacterial flagellin is conserved in seed plants, and functional FLS2
orthologues are found from A. thaliana to rice. As FLS2 is expressed
throughout the whole plant including the SAM (Supplementary Fig. 2),
flagellin-FLS2 signalling could provide immune protection in
different parts of the plant body after infections.

While developing a plant expression system to screen for
peptide-mediated receptor-like kinase (RLK) signalling, we observed that
the endogenously modified 12-amino-acid CLV3p (Supplementary
Table 1) triggered responses similar to those of flg22 (the conserved
22-amino-acid peptide of bacterial flagellin) in mesophyll
protoplasts. This finding was unexpected because CLV3p is normally
expressed, secreted and processed by the stem cells to control SAM
maintenance via CLV1 and CLV2 receptors. Flg22 and CLV3p,
but not ΔCLV3p lacking the last His residue, activated similar
mitogen-activated protein kinase (MAPK) activities detected by the in-gel
kinase assay (Fig. 1a and Supplementary Fig. 3). Highly purified
CLV3p synthesized by different sources displayed the same activities,
ruling out the possibility of contamination. We sought to identify the
CLV3p receptor in leaf cells by examining MAPK activation in various
receptor mutants. Neither the dominant clv1-1 mutant (Fig. 1a) nor
the clv2-1 mutant (Supplementary Fig. 4) affected CLV3p-triggered
MAPK activation. Surprisingly, two independent fls2 mutant
alleles of Landsberg erecta (Ler) and Columbia (Col-0), but not the
efr-1 (the bacterial elongation factor EF-Tu receptor EFR) mutant,
failed to support the activation of MAPKs by both flg22 and CLV3p
(Fig. 1a and Supplementary Fig. 3). Complementation with the
wild-type FLS2 gene (Fig. 1b), but not CLV1 (data not shown) in the fls2
mutant, confirmed that FLS2 could recognize both flg22 and CLV3p to
mediate MAPK signalling. Importantly, CLV3p and flg22 activated
similar early marker genes, including FRK1, WRKY29 and WRKY30
(Fig. 1c).

FLS2 signalling requires the recruitment of the RLK BAK1, and the
interaction between FLS2 and BAK1 represents the earliest event
(within a minute) triggered by flg22 binding to FLS2 (refs 6, 14-16).
Like flg22, CLV3p also induced the immediate interaction between
FLS2 and BAK1 detected by reciprocal co-immunoprecipitation
(Supplementary Fig. 5a, b). Consistently, CLV3p signalling monitored
by MAPK activation was greatly diminished in the bak1-4 mutant
(Supplementary Fig. 6). Notably, CLV3p pre-treatment could
confer enhanced resistance to the pathogenic bacteria Pseudomonas
syringae pv. tomato DC3000 in an FLS2-dependent manner (Fig. 1d).
These comprehensive analyses strongly support our new finding that
the conserved flagellin receptor FLS2 can recognize the stem-cell
peptide CLV3p and trigger convergent innate immune signalling.

Peptide titration experiments using the FLS2 and BAK1
co-immunoprecipitation assay showed that 1 nM flg22 was as potent as
1 µM CLV3p (Fig. 2a and Supplementary Fig. 5c) required for SAM
suppression. Intriguingly, flg22 and CLV3p peptides supported
similar primary gene activation but distinct long-term growth effects


[Figure 1, panels a-d: genotypes Ler, clv1-1 and fls2-24; control or FLS2-HA in fls2; treatments control, 1 µM CLV3p, 10 µM CLV3p and 100 nM flg22.]


Figure 1 | CLV3p and flg22 activate similar downstream responses through
FLS2. a, MAPK activity analysis. Ler, clv1-1 and fls2-24 protoplasts were
treated with 1 µM CLV3p (C), 1 µM ΔCLV3p (ΔC) or 100 nM flg22 (F) for
10 min. b, FLS2-HA complements fls2. c, CLV3p triggers flg22 marker gene
activation. Quantification by qRT-PCR; peptide treatment for 1 h. Error bars
indicate s.d. (n = 3). d, CLV3p enhances resistance to P. syringae pv. tomato
DC3000. Error bars indicate s.d. (n = 3). c.f.u., colony-forming units.

Figure 2 | CLV3p and flg22 share similar perception through FLS2.
a, CLE40p fails to induce FLS2-BAK1 interaction. Treatments: flg22 (1,
10 nM), 1 µM CLV3p or 1 µM CLE40p for 10 min. b, Flg22(Δ2) blocks CLV3p
and flg22 signalling. Flg22(Δ2) (50 µM). c, LRR mutations of FLS2 eliminate
flg22 and CLV3p perceptions in fls2. Top, in-gel kinase assay; bottom, protein
expression. d, ¹²⁵I-Tyr-CLV3p specifically binds to FLS2. Nonspecific binding
(20 µM unlabelled Tyr-CLV3p). Error bars indicate s.d. (n = 3). c.p.m., counts
per min. e, Saturation binding curve and Scatchard plot. Error bars indicate s.d.
(n = 3). f, Binding competition analysis with peptides. Error bars indicate s.d.
(n = 3). Nonspecific binding (10 µM unlabelled Tyr-CLV3p).


in seedling assays (Figs 3 and 4 and Supplementary Fig. 7). Flg22 is a
MAMP from non-self invaders and is detected by a very sensitive
perception system to induce innate immunity in a timely manner
after infection. CLV3p, on the other hand, is an endogenous plant
signal naturally secreted in the stem-cell zone to activate constitutive
innate immunity via FLS2 in the SAM, which might provide a type of
‘vaccination’ before any infections to elicit sufficient immune
protection without severe growth penalty caused by MAMPs (Fig. 3a-e).

CLV3 belongs to the CLV3/ESR-related (CLE) gene family, which is
conserved in diverse plant species. The Arabidopsis genome has
32 CLE genes. CLV3p and many other CLE peptides share similar
root growth inhibition activity through the CLV2-CRN (CORYNE)/
SOL2 (SUPPRESSOR OF OVEREXPRESSION OF LLP1-2) receptor
complex, revealing a high degree of redundancy in peptide signalling.
To assess the specificity of CLV3p-FLS2 signalling, we examined the
activity of synthetic peptide CLE40p, which belongs to the same CLE
subgroup as CLV3p (refs 4, 17, 18, 21). CLE40p triggered the growth
arrest of SAM, repression of WUSCHEL (WUS) detected by
pWUS::GUS (ref. 23), and root growth inhibition (Supplementary Fig.
8a, b) as previously reported for CLV3p and CLE19p (refs 4, 17, 18, 21).
WUS encodes a homeodomain transcription factor that has a central
role in Arabidopsis SAM maintenance. However, CLE40p did not
promote FLS2-BAK1 interaction or flg22 early marker gene activation
(Fig. 2a and Supplementary Fig. 8c). CLV3p-FLS2 signalling might be
unique among CLE peptides.

To determine whether flg22 and CLV3p trigger FLS2 signalling
through the same or different extracellular sites, we carried out
competition experiments using a well-established antagonist peptide lacking only

Figure 3 | CLV3p-mediated SAM arrest and immune signalling are
uncoupled from flg22-triggered growth suppression. a-e, Growth inhibition
analysis. Treatment with flg22 (1, 10 nM) or CLV3p (1 µM) in Col-0 (a), fls2
(b), Ler (c), fls2-24 (d) and clv2-1 (e) seedlings. Scale bars, 1 cm. f-i, CLV3p
suppresses the SAM in Col-0 and fls2 plants. Red asterisks indicate SAM region.
Scale bars, 50 µm (f-i). j-m, CLV3p represses pWUS::GUS expression.
Treatment was without (j) or with 1 µM CLV3p (k), 1 µM ΔCLV3p (l) or 1 µM
flg22 (m). Red arrows indicate SAM region. Scale bars, 1 mm.


the last two amino acids at the carboxy terminus of flg22 (flg22(Δ2)).
On the basis of the CLV3p- or flg22-triggered FLS2-BAK1 interaction
assay by co-immunoprecipitation, we showed that flg22(Δ2) effectively
competed with and blocked flg22 and CLV3p signalling (Fig. 2b).
Moreover, two LRR mutants, one in the 10th LRR domain (fls2-24)
and the other in the 23rd LRR domain (the LRR23b mutant) of FLS2,
failed to activate flg22 and CLV3p signalling detected by MAPK
activation despite normal FLS2 protein levels (Fig. 2c). These results
indicate that flg22 and CLV3p probably share binding and activation
sites in the extracellular LRR domain of FLS2.

On the basis of the established studies on the flg22-FLS2 and
CLV3p-CLV1 interactions, we developed a cell-based assay to
show that ¹²⁵I-Tyr-CLV3p interacted directly with FLS2 expressed
in the null fls2 mutant protoplasts. Importantly, only the wild-type
FLS2 protein but not the fls2-24 and LRR23b mutant protein or
the brassinosteroid receptor kinase (BRI1) showed specific binding
to ¹²⁵I-Tyr-CLV3p (Fig. 2d). The saturation binding curve and
Scatchard plot were generated using the specific binding assay. The
estimated dissociation constant (Kd) for FLS2 and ¹²⁵I-Tyr-CLV3p
interaction was 34.7 nM (Fig. 2e), which was close to the Kd for
CLV1 and CLV3p interaction. The specific binding could be
competed by the unlabelled Tyr-CLV3p, CLV3p, flg22, flg22(Δ2), but not
CLE40 (Fig. 2f and Supplementary Fig. 9). Tyr-CLV3p and CLV3p
displayed identical effectiveness in competing with ¹²⁵I-Tyr-CLV3p
for binding to FLS2 (Supplementary Fig. 9), immune marker gene
activation and root inhibition (data not shown). ¹²⁵I-Tyr-CLV3p
binding to FLS2 shared characteristics of CLV3p binding to CLV1 (ref. 28).
Although the estimated Kd for ¹²⁵I-Tyr-flg22 binding to FLS2 is lower
and FLS2 responses to flg22 are more sensitive (Figs 1c, d and 2a, and
Supplementary Fig. 5a, c), flg22 did not compete more effectively than
Tyr-CLV3p and CLV3p for ¹²⁵I-Tyr-CLV3p binding to FLS2. Because
the sequences of CLV3p and flg22 do not share any overt similarity and
ΔCLV3p was ineffective in activating or blocking FLS2 signalling


19 MAY 2011 | VOL 473 | NATURE | 377 


©2011 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 4, panels a and b: expression of FRK1, WRKY29, MYB51 and AtPEP3 (a) and WUS (b) in Ler, fls2-24, clv1-1 and clv2-1 after control, flg22 or CLV3p treatment.]


Figure 4 | CLV3p-FLS2 signalling enhances innate immunity for SAM
protection. a, Flg22 marker gene activation by CLV3p through FLS2 in the
SAM. b, WUS repression by CLV3p was mediated via CLV1 or CLV2 but not
FLS2. The SAM tissues were analysed by qRT-PCR after treatment with 1 nM
flg22 or 1 µM CLV3p for 1 h. Error bars indicate s.d. (n = 3). c, Infection and


(Fig. 1a, data not shown), it was surprising to discover that they could 
still compete for some potentially shared binding sites on the LRR of 
FLS2 (Fig. 2b, c, f). The precise location of flg22 and CLV3p binding to 
FLS2 awaits future co-crystallographic analysis or mass spectrometry 
studies on cross-linked ligand-receptor complexes. 
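
The saturation binding and Scatchard analysis used to estimate the dissociation constant can be sketched in Python. This is a minimal illustration that assumes a simple one-site binding model, B = Bmax·[L]/(Kd + [L]). The function names, the Bmax value and the ligand concentrations are hypothetical, and free ligand is approximated by the concentration added. Only the Kd of 34.7 nM and the subtraction of nonspecific from total binding come from the text. In the Scatchard form, B/F = (Bmax − B)/Kd, so a straight-line fit of B/F against B has slope −1/Kd.

```python
def specific_binding(total, nonspecific):
    """Specific binding = total binding minus nonspecific binding."""
    return total - nonspecific


def one_site_binding(free_nM, bmax, kd_nM):
    """Bound ligand predicted by a one-site model (assumed here)."""
    return bmax * free_nM / (kd_nM + free_nM)


def scatchard_fit(free_nM, bound):
    """Least-squares line through (bound, bound/free); returns (Kd, Bmax)."""
    ratios = [b / f for b, f in zip(bound, free_nM)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope        # slope of a Scatchard plot is -1/Kd
    bmax = intercept * kd    # intercept is Bmax/Kd
    return kd, bmax


# Hypothetical, noise-free data generated from the model itself.
free = [1.0, 5.0, 10.0, 25.0, 50.0, 100.0]   # nM
bound = [one_site_binding(f, bmax=100.0, kd_nM=34.7) for f in free]
kd_est, bmax_est = scatchard_fit(free, bound)
```

With real data the points scatter around the line, and nonlinear fitting of the saturation curve is often preferred to the Scatchard transform because the transform distorts measurement error.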

In previous studies, innate immune responses and growth inhibition
have always been tightly linked. We investigated the effects of flg22
and CLV3p on seedling growth. Surprisingly, CLV3p did not
inhibit shoot growth but caused stronger root growth arrest in Col-0
and Ler wild-type seedlings (Fig. 3a, c and Supplementary Figs 7a and
8b). Whereas seedling growth inhibition in both shoots and roots by
flg22 was eliminated in two independent fls2 mutants, the stronger
root growth arrest by CLV3p was retained (Fig. 3b, d). Although
CLV3p activated, via FLS2, a spectrum of innate immune responses
similar to those via flg22-FLS2 signalling in mesophyll cells, seedlings
and the SAM (Figs 1, 2 and 4 and Supplementary Figs 3, 5 and 8),
CLV3p did not stimulate the typical flg22-FLS2-mediated growth
suppression in whole seedlings (Fig. 3a-e). Similar to other synthetic
CLE peptides, CLV3p-triggered root growth arrest was abolished in
the clv2-1 mutant (Fig. 3c-e and Supplementary Fig. 7). Thus,
CLV3p-FLS2 signalling activated only immune responses but not
the general growth inhibition. Significantly, CLV3p was very active
in triggering the typical SAM arrest in null fls2 seedlings (Fig. 3f-j).
The SAM growth arrest was correlated with the repression
of a sensitive pWUS::GUS reporter in the organizing centre by the
exogenous CLV3p but not ΔCLV3p or flg22 (Fig. 3j-m).

Because CLV3p is specifically expressed and secreted from stem cells
of the SAM, it is critical to examine CLV3p-FLS2 signalling in
the SAM to evaluate its physiological relevance. Using quantitative
real-time reverse transcriptase PCR (qRT-PCR) analysis with isolated
SAM tissues, we found that CLV3p clearly triggered two parallel signalling
pathways in the SAM (Fig. 4a, b and Supplementary Figs 10-13). The activation
of important immune marker genes by CLV3p in wild type, clv1-1,
clv2-1 but not fls2-24 validated the action of innate immune signalling
via FLS2 in the Arabidopsis SAM. These flg22 and CLV3p inducible
genes, including FRK1 (an RLK), WRKY29, WRKY30, MYB15, MYB51
(transcription factors) and PEP3 (a peptide inhibiting P. syringae pv.
tomato DC3000 growth via RLK signalling), have important roles in
bacterial and fungal resistance. Consistently, some of these
marker genes showed reduced endogenous expression in the SAM of
clv3-2, but were constitutively expressed at higher levels in the SAM
but not other tissues of wild type (Supplementary Figs 10, 11a and
proliferation of pathogenic bacteria in the SAM lacking CLV3p-FLS2
signalling. P. syringae pv. tomato DC3000-GFP was co-cultivated with
seedlings for 2 or 4 days. P. syringae pv. tomato DC3000-GFP was visualized
using a confocal microscope. Scale bars, 50 µm. BF, bright field.


12). Their endogenous expression levels were low in both the SAM
and other tissues in clv3-2 and fls2-24 (Supplementary Fig. 12).
Complementation of clv3-2 for immune marker gene expression in
the SAM could be achieved with nanomolar range of exogenous
CLV3p, reflecting the physiological relevance of the Kd for CLV3p
and FLS2 interaction and CLV3p-FLS2 signalling in the SAM
(Fig. 2e and Supplementary Fig. 10). The flg22 induction of these
immune marker genes was observed in the SAM of clv1-1, clv2-1 and
clv3-2 mutants, supporting the potency of MAMP signalling via FLS2
(Fig. 4a and Supplementary Fig. 11b). At 10-100 pM flg22, the immune
marker gene induction levels were reduced in the SAM of clv3-2
compared to those in the SAM of wild type and clv1-1, indicating the
operation of both CLV3p-FLS2 and flg22-FLS2 signalling pathways
in the SAM (Supplementary Fig. 13). The repression of WUS was
triggered by CLV3p only in the SAM of wild type and fls2-24, but
not clv1-1 and clv2-1 (Fig. 4b).

Most notably, we have never detected the presence of a single live 
P. syringae pv. tomato DC3000-GFP bacterium in the SAM of wild-type 
seedlings in 11 infection experiments by visualizing bacteria prolifera- 
tion using confocal microscopy for up to 4 days (Fig. 4c and Supplemen- 
tary Fig. 14a). Because P. syringae pv. tomato DC3000-GFP could be 
easily visualized in the infected wild-type and clv3-2 cotyledons (Sup- 
plementary Fig. 14b, d), the wild-type SAM appeared to exhibit differ- 
ential immunity (Fig. 4c and Supplementary Fig. 14a). Notably, the SAM 
was no longer protected from P. syringae pv. tomato DC3000-GFP infec- 
tion in fls2-24 and clv3-2 mutants (Fig. 4c and Supplementary Fig. 14c, e, 
f), supporting the important role of endogenous CLV3p in protection of 
the SAM through the FLS2-mediated innate immune signalling pathway 
(Fig. 4c and Supplementary Figs 1 and 14). To demonstrate further and 
quantify the proliferation and growth of P. syringae pv. tomato DC3000- 
GFP in the SAM of the fls2-24 and clv3-2 mutants observed after 2-4 days 
of infection, we counted the increasing bacteria numbers and carried 
out quantitative PCR analysis of the GFP DNA from the bacteria, 
both indicating the loss of the distinct SAM immunity (Fig. 4c and 
Supplementary Figs 14e and 15). There were higher numbers of P. syr- 
ingae pv. tomato DC3000-GFP bacteria in the bigger clv3-2 SAM with 
more cells of similar size but not bigger cells (Supplementary Fig. 14c, e). 
Importantly, P. syringae pv. tomato DC3000-GFP was found to be com- 
pletely excluded from the similarly enlarged clv1-1 and clv2-1 SAM, in 
which CLV3p-FLS2 signalling remained active (Fig. 4c). On the basis of 
the qPCR analysis of GFP DNA derived only from P. syringae pv. tomato 
DC3000-GFP, nonspecific bacteria attachment background could be 
estimated from the SAM tissue samples 1 h after bacteria-seedling
co-cultivation. Consistent with confocal microscopic observations, active
bacteria growth and proliferation were detected only in the
SAM of the clv3-2 and fls2-24 mutants (Supplementary Fig. 15).

We have uncovered a surprising mechanism underlying the
stem-cell-triggered immunity for pathogenic bacteria through CLV3p-FLS2
signalling (Supplementary Fig. 1). It will be interesting to examine the
SAM protection from a variety of other pathogens. We propose that
CLV3p is recognized by two distinct types of receptors involved in
mostly non-overlapping functions in the SAM (Supplementary Fig. 1).
The ‘constitutive’ immunity in the SAM resembles flg22 pre-treatment
as a type of vaccination (before infection), which is more effective to
confer protection against virulent pathogens such as P. syringae pv.
tomato DC3000 possessing effectors to cripple MAMP signalling. It
is surprising that CLV3p-FLS2 signalling seems to have evolved to
provide constitutive immune protection in the SAM but avoid the
penalty from potent growth suppression associated with MAMP
signalling. A future challenge is to elucidate the precise differential
downstream signalling events via the same receptor in response to
different peptide ligands. Lacking the genes for antibodies and immune
cell receptors in humans to respond to new signals from diverse
invaders, plant RLKs, displaying high polymorphism and fast
evolution, may provide an alternative means to recognize self or non-self in
a beneficial manner through constant selections in evolution. It will
also be important to explore the roles of other secreted plant peptides
and known or orphan receptors in innate immunity.


METHODS SUMMARY 

Plasmid constructs. The constructs for expressing FLS2-haemagglutinin (HA)
and BAK1-Flag for co-immunoprecipitation were reported previously.
LRR23b-HA and fls2-24-HA are mutant variants of the LRR domain of FLS2,
which were generated by site-directed mutagenesis. The fls2-24 allele is a Gly to
Arg mutation in the 10th LRR domain of FLS2 and is insensitive to flg22 (ref. 25).
LRR23b is mutated in two amino acids (Gln to Leu and Phe to Leu) in the 23rd
LRR domain of FLS2 and lacks flg22 signalling.

Mesophyll protoplast transient assays. Protoplast isolation and transient
expression assays were performed as previously described. For protein
expression in co-immunoprecipitation and in-gel kinase assays, protoplasts were
transfected with plasmid DNA and incubated for 6 h at room temperature. Then,
peptides such as CLV3p and flg22 were added for 10 min to induce MAPK
activation or FLS2-BAK1 interaction. For qRT-PCR analysis, protoplasts were treated
with peptides for 1 h.

In-gel kinase assay and co-immunoprecipitation. Both experiments were
performed according to procedures previously described. MBP (Invitrogen) was
used as a substrate for endogenous MPK3 and MPK6 activation analysis. For
co-immunoprecipitation, FLS2-HA and BAK1-Flag were co-immunoprecipitated
by an anti-Flag antibody (Sigma) and detected by an anti-HA antibody (Roche)
in immunoblot analysis.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Plant materials and growth conditions. Col-0 and Ler were used as wild-type
Arabidopsis plants in this study. The bak1-4, efr-1, and fls2 (Salk_141277) mutants
are in the Col-0 background and clv1-1, clv2-1, clv3-2 and fls2-24 are in the Ler
background. The clv1-1 mutation is in the kinase domain and represents a
dominant-negative allele. The clv2-1 mutation causes an early stop codon at
the 33rd residue and is a null allele. The γ-ray-induced clv3-2 is a presumed null
allele. Wild-type and mutant plants were grown on soil at 23 °C, 65% humidity,
and 75 µmol m⁻² s⁻¹ light intensity under 12 h light/12 h dark photoperiod
conditions for 4 weeks before mesophyll protoplast isolation. For liquid culture of
Arabidopsis seedlings, seeds were germinated and grown in 6-well plates containing
1 ml of liquid medium (0.5× MS and 0.5% sucrose, pH 5.8 adjusted with KOH).
For GUS assay in the SAM and qRT-PCR analysis using SAM tissues, seedlings
were grown for 8 days. For growth inhibition analysis by flg22 and CLV3p,
seedlings were grown under the same conditions for 8 days (Fig. 3a, b and Supplementary
Fig. 8b) or 6 days (Fig. 3c-e and Supplementary Fig. 7b).

qRT-PCR analysis. Total RNA was isolated from protoplasts or SAM tissues with
TRIzol reagent (Invitrogen). To harvest SAM tissues, seedlings were frozen
instantly in liquid nitrogen in a mortar pre-chilled on dry ice.
Cotyledons, hypocotyls and roots were removed using fine forceps. The purity
of the harvested SAM tissue was confirmed by SAM-specific marker genes and the
absence of marker genes not expressed in the SAM (Supplementary Fig. 2).
First-strand cDNA was synthesized from 1 µg of total RNA with M-MLV reverse
transcriptase (Promega). All qRT-PCR analyses were performed with a CFX96
real-time PCR detection system and iQ SYBR green supermix (Bio-Rad). ACT2
(ACTIN2, At3g18780) was used as a control gene.
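
The text states only that ACT2 served as the control gene; it does not say how relative expression was calculated. A common choice is the 2^(−ΔΔCt) method, sketched below in Python under that assumption. The function name and the Ct values are hypothetical and are shown only to make the arithmetic concrete.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method (assumed here).

    Each target Ct is first normalized to the reference gene (for example
    ACT2) in the same sample, then the treated sample is compared with the
    untreated control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target amplifies three cycles earlier after
# treatment while the reference gene is unchanged.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
```

The method assumes roughly 100% amplification efficiency for both primer pairs; where that does not hold, an efficiency-corrected formula would be needed.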

GUS staining. GUS staining was performed as described. Plants were fixed in
90% cold acetone for 20 min and rinsed twice in staining buffer without X-Gluc
(5-bromo-4-chloro-3-indoxyl-β-D-glucuronide). Samples were infiltrated with
staining buffer (100 mM NaPO₄ buffer, pH 7.0; 0.5 mM ferrocyanide; 0.5 mM
ferricyanide; 0.1% Triton X-100; 10 mM EDTA, pH 8.0; 1 mM X-Gluc (Gold
Biotechnology)) under vacuum for 10 min and incubated at 37 °C for 3 h.
The staining buffer was then removed and samples were dehydrated up to 70% ethanol.
Microscopic analysis was carried out with a Leica DFC 500 camera mounted on
a Leica MZ16F.

Bacterial infection assay in the SAM. Nine seeds were sown in 1-ml liquid
medium in 6-well plates and grown under constant light (50-65 µmol m⁻² s⁻¹)
at 25-27 °C without shaking. P. syringae pv. tomato DC3000-GFP culture was
grown in KB liquid medium (50 µg ml⁻¹ of rifampicin and 15 µg ml⁻¹ of
tetracycline) with shaking at 28 °C. Overnight cultured P. syringae pv. tomato
DC3000-GFP were washed twice with water and diluted to an optical density at
600 nm (OD600) of 0.02. Diluted P. syringae pv. tomato DC3000-GFP (50 µl) was added
into the liquid medium with 2-day-old seedlings. Plants and bacteria were
co-cultivated with gentle shaking (50 r.p.m.) for 2, 3 or 4 days under constant light.
For observation of bacteria-infected SAM, co-cultivated seedlings were washed
with 70% ethanol twice and rinsed twice with water. All seedlings were placed on
glass slides with 1 ml of water, and squashed gently by coverslips for bacterial
observation using a confocal laser-scanning microscope (Leica TCS-NT). The
experiment was repeated at least five times with similar results in Ler, fls2-24,
clv3-2, clv1-1 and clv2-1 (Fig. 4c). For the qPCR analysis of DC3000-GFP, diluted
P. syringae pv. tomato DC3000-GFP (OD600 = 0.5, 200 µl) was co-cultivated with
wild-type or mutant seedlings for 3 or 4 days. After washing and rinsing seedlings
twice, tissues from five SAMs were harvested and ground in 100 µl of water.
Control experiments were conducted at 0 days after co-cultivation to determine
nonspecific bacterial attachment 1 h after co-cultivation (black bar). The
experiment was repeated three times with similar results. The GFP level determined by
qPCR was correlated with specific bacterial growth, and normalized based on the
Arabidopsis ACT2 gene in the SAM tissues.

Seedling pathogen assay. The protocol was modified from that previously
described. Nine seeds were sown in 6-well plates containing 1 ml of liquid
medium. Plants were grown under constant light at 25-27 °C without shaking.
P. syringae pv. tomato DC3000 culture was grown in KB liquid medium (50 µg ml⁻¹
of rifampicin) with shaking at 28 °C. Overnight cultured P. syringae pv. tomato
DC3000 was washed twice with water and diluted to OD600 = 0.02, 1 × 10⁷ c.f.u.
ml⁻¹. After 6 days of seedling growth, 50 µl of diluted P. syringae pv. tomato
DC3000 was added into 1 ml of fresh 0.5× MS liquid medium without sucrose.
After adding P. syringae pv. tomato DC3000, plates containing plants and bacteria
were co-cultivated with gentle shaking (<50 r.p.m.) for 1 day under constant light
(50-65 µmol m⁻² s⁻¹). For CLV3p- and flg22-induced immunity, seedlings were
treated with 1 µM or 10 µM of CLV3p or 100 nM of flg22 peptide 1 day before
adding P. syringae pv. tomato DC3000. For bacterial counting, co-cultivated
seedlings were washed with 70% ethanol twice and rinsed twice with water. Then, three
seedlings were put into each of three 1.5-ml tubes containing 100 µl of water and
ground by a hand drill and blue pestles. After preparing serial dilutions, 10 µl of
diluted bacteria (from10~* to 10° °) was spread on the KB plates and incubated for 
2 days at 28 °C before counting. 
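
The plate counts described above convert to bacterial titres in the usual way: colonies divided by the product of the dilution and the plated volume. The Python sketch below is illustrative only. The function names, the colony count and the dilution are hypothetical, while the 10 µl plated volume, the 100 µl grinding volume and the three seedlings per tube come from the protocol.

```python
def cfu_per_ml(colonies, dilution, plated_volume_ml):
    """Colony-forming units per ml of the undiluted extract."""
    return colonies / (dilution * plated_volume_ml)


def cfu_per_seedling(colonies, dilution, plated_volume_ml=0.010,
                     extract_volume_ml=0.100, seedlings=3):
    """Scale the titre to the whole extract and divide by seedling number."""
    titre = cfu_per_ml(colonies, dilution, plated_volume_ml)
    return titre * extract_volume_ml / seedlings


# Hypothetical count: 25 colonies on a plate spread with 10 µl of a
# 10^-4 dilution.
titre = cfu_per_ml(colonies=25, dilution=1e-4, plated_volume_ml=0.010)
per_seedling = cfu_per_seedling(colonies=25, dilution=1e-4)
```

In practice only plates with a countable number of colonies would be used, and counts from replicate tubes would be averaged before plotting.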

Assay for CLV3p-mediated SAM arrest. Twenty seeds of Col-0 and fls2 mutant
were sown in Petri dishes (100 × 25 mm) containing 10 ml of liquid medium with
or without 1 µM CLV3p. Plants were grown in a growth chamber at 23 °C under
short-day conditions (8 h light/16 h dark, 75 µmol m⁻² s⁻¹) for 4 weeks. The SAM
tissues were cut and fixed using 4% (w/v) paraformaldehyde/4% (v/v) DMSO at
4 °C overnight. Collected samples were dehydrated through an ethanol series (30%,
50%, 70%, 95% for 1 h in each step) at 4 °C and stained with 0.1% Eosin Y (Sigma)
in 100% ethanol at 4 °C overnight. Ethanol was exchanged through a histoclear series
(50% ethanol: 50% histoclear, 100% histoclear, 100% histoclear for 1 h in each
step) (National Diagnostics). Histoclear was then gradually replaced with melted
paraffin (Fisher Scientific) in a 60 °C chamber. Replacement with freshly melted
paraffin was performed for 4 days. Paraffin-embedded tissues were poured into
the mould and adjusted to the appropriate position. Sectioning was carried out with a
rotary microtome (Leica RM2255) at 8-11m thickness. Sectioned ribbons were 
placed on poly-prep glass slides (Sigma) with pre-warmed water and incubated on
a slide warmer (Fisher Scientific) at 42 °C overnight. For meristem staining,
paraffin was removed from the sections with 100% histoclear twice for 10 min. Sectioned
tissues were hydrated through reverse ethanol series (100%, 70%, 30% ethanol 
and water for 2 min in each step). Sections were stained with 0.1% Giemsa (Sigma) 
for 5 min and rinsed briefly with water. Stained sections were dehydrated through 
ethanol series (2 min in each step) and transferred to 100% histoclear for 2 min. 
For microscopic analysis, samples were dried and mounted with Cytoseal 60 
(Richard-Allan Scientific) before taking pictures using Leica DM5000B. 

Direct binding assay. The protocol was modified from that previously
described’”***?°. For preparation of receptor proteins, FLS2-HA, fls2-24-HA,
LRR23b-HA and BRI1-HA were expressed in fls2 protoplasts (2.5 × 10⁵ cells)
for 6 h. Protoplasts were harvested and then re-suspended in 500 µl of binding
buffer (25 mM MES/KOH, pH 5.8, 3 mM MgCl₂, 10 mM NaCl) with 2 mM DTT
and protease inhibitor (Roche). Cells were vigorously mixed by vortex and kept on
ice for 5 min, and centrifuged at 10,000g for 20 min to yield a pellet. The pellets were
re-suspended in 100 µl of binding buffer and the protein concentration of the
re-suspended extract was measured with a NanoDrop 1000 spectrophotometer. This
extract contained 50 µg of total protein. For the binding assay, re-suspended
extracts (100 µl) in binding buffer were mixed with ¹²⁵I-Tyr-CLV3p (100 fmol
in each sample) without or with unlabelled Tyr-CLV3p as a competitor for
30 min on ice. The average specific radioactivity of five different batches of
¹²⁵I-Tyr-CLV3p was 2,023.52 Ci mmol⁻¹. ¹²⁵I-Tyr-CLV3p-bound extracts were
collected by a vacuum filtration system through glass fibre filters (Macherey-Nagel 
MN GEF-2, 2.5-cm diameter), which were pre-incubated with 1% BSA, 1% bacto- 
trypton, 1% bactopepton in binding buffer. Filters were washed with 15 ml of cold 
binding buffer and the retained radioactivity was determined by a gamma counter 
Beckman LS6500. Specific binding was measured by subtracting nonspecific binding
(with 10–20 µM unlabelled Tyr-CLV3p competitor) from total binding (without
competitor). In the saturation binding assay, the nonspecific binding in the
presence of 20 µM unlabelled Tyr-CLV3p competitor increased linearly with
1 to 100 nM ¹²⁵I-Tyr-CLV3p and accounted for 40–60% of total binding, which
was similar to the range of nonspecific binding observed with ¹²⁵I-Tyr-flg22
binding to intact cells (Fig. 3a, Bauer et al., 2001)*’. The dissociation constant
(Kd), Scatchard plot and the nonlinear regression analysis of the competition assay
were generated using the Prism 5 program (GraphPad). From the saturation data
of Bmax = 0.53 pmol mg⁻¹ protein and 50 µg protein from 2.5 × 10⁵ cells,
6.36 × 10⁴ ¹²⁵I-Tyr-CLV3p binding sites per cell could be estimated in total cell
extracts. Specific CLV3p binding to FLS2 was similar to CLV3p binding to CLV1 
and CLV2, the established CLV3p receptors**”. 
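The estimate of binding sites per cell follows from the saturation data by unit conversion. The sketch below reproduces that arithmetic; the function names are hypothetical, and the cell number is taken as 2.5 × 10^5, which is an assumption because the exponent is garbled in the extracted text (this value is the one consistent with a result on the order of 10^4 sites per cell).

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def specific_binding(total, nonspecific):
    """Specific binding: total binding minus binding measured with excess
    unlabelled competitor (both in the same units, e.g. c.p.m.)."""
    return total - nonspecific


def binding_sites_per_cell(bmax_pmol_per_mg, protein_ug, n_cells):
    """Convert a Bmax (pmol ligand per mg protein) into sites per cell."""
    protein_mg = protein_ug / 1000.0
    bound_mol = bmax_pmol_per_mg * protein_mg * 1e-12  # pmol -> mol
    return bound_mol * AVOGADRO / n_cells


# Values quoted in the text; n_cells is the assumed reading of the exponent
sites = binding_sites_per_cell(bmax_pmol_per_mg=0.53,
                               protein_ug=50.0,
                               n_cells=2.5e5)
print(f"{sites:.3g} binding sites per cell")
```

With these inputs the result is on the order of 6 × 10^4 sites per cell, in line with the published figure; the small difference in the third digit presumably reflects rounding of Bmax in the text.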
A novel protein family mediates Casparian strip
formation in the endodermis

Polarized epithelia are fundamental to multicellular life. In animal
epithelia, conserved junctional complexes establish membrane dif- 
fusion barriers, cellular adherence and sealing of the extracellular 
space’. Plant cellular barriers are of independent evolutionary 
origin. The root endodermis strongly resembles a polarized epi- 
thelium and functions in nutrient uptake and stress resistance’. Its 
defining features are the Casparian strips, belts of specialized 
cell wall material that generate an extracellular diffusion barrier’. 
The mechanisms localizing Casparian strips are unknown. Here 
we identify and characterize a family of transmembrane proteins 
of previously unknown function. These ‘CASPs’ (Casparian 
strip membrane domain proteins) specifically mark a membrane 
domain that predicts the formation of Casparian strips. CASP1 
displays numerous features required for a constituent of a plant 
junctional complex: it forms complexes with other CASPs; it 
becomes immobile upon localization; and it sediments like a large 
polymer. CASP double mutants display disorganized Casparian 
strips, demonstrating a role for CASPs in structuring and localiz- 
ing this cell wall modification. To our knowledge, CASPs are the 
first molecular factors that are shown to establish a plasma mem- 
brane and extracellular diffusion barrier in plants, and represent a 
novel way of epithelial barrier formation in eukaryotes. 

In 1865 Robert Caspary described a novel cell type surrounding the 
vascular cylinder of roots’. Caspary named this layer ‘Schutzscheide’, 
protective sheath, but during the twentieth century it was termed 
endodermis, the ‘inner skin’ of plants. Its defining, belt-like cell wall 
thickenings became known as Casparian strips. The ligno-suberic 
polymers of the Casparian strips impregnate the primary cell walls 
between cells, providing a localized diffusion barrier**. The appear- 
ance of Casparian strips in evolution coincides with that of vascular 
tissues, and a huge body of physiological work proposes numerous 
functions for Casparian strips”® *. Despite their importance, and our 
significant insights into endodermal specification?"', we remain 
ignorant about the molecules that govern endodermal differentiation. 
Recently, polarized distribution of transporters within the endodermis 
was demonstrated'**. Moreover, a central domain, underlying the 
Casparian strip (Casparian strip membrane domain, CSD) was shown 
to separate the polar domains and prevent lateral diffusion’. This CSD 
displays tight matrix attachment and appears very electron dense in 
electron micrographs’*. Despite obvious functional resemblances 
between endodermis and animal epithelia’®, plants lack most compo- 
nents of animal junctional complexes, and plant cell wall thickness 
does not allow for direct, protein-mediated cell-cell contact, crucial for 
animal tight junctions. Therefore, sealing must rely on the coordinated 
deposition of hydrophobic cell wall material by some completely 
unknown mechanism. 

In Arabidopsis thaliana, Casparian strips are endodermis-specific 
structures. We therefore searched microarrays'’ for endodermis- 
enriched genes, the products of which were predicted to be secreted
or plasma-membrane localized. Five proteins (>45% amino acid
identity) were identified that belong to the Arabidopsis ‘uncharacterized 
protein family’ UPF0497 (38 members, Supplementary Fig. 1a). Family 
members have a predicted topology of four-membrane spans with 
cytosolic amino and carboxy termini and conserved extracellular loops 
(Fig. 1a and Supplementary Fig. 1b). Reporter lines showed strong
and specific transcription in the endodermis, starting in the elongation 
zone (Supplementary Fig. 2a). Fluorescent protein fusions of all five 
proteins under endogenous or endodermis-expressing SCARECROW 
(SCR) promoter’® showed very restricted localization within the 
plasma membrane, coinciding with the position of the CSD (Fig. 1b- 
d and Supplementary Fig. 2b, c). The CASP1-GFP signals are strictly 
complementary with those of markers for peripheral (outer) and 
central (inner) plasma membrane domains’* (Fig. 1e, f and Supplementary
Fig. 3). Because these are the first proteins found to mark the CSD, we termed
them Casparian strip membrane domain proteins 1 to 5 (CASP1–5). In
differentiated endodermal cells, Casparian strips form a supracellular 
network of crosslinked cell walls’’. Three-dimensional reconstructions 
of CASP1-GFP revealed a markedly similar network of precisely 
aligned belts of CASP1-GFP signals forming a cylindrical network 
around the stele (Supplementary Movie 1). By immuno-electron 
microscopy, we found that width and position of the CASP1-GFP 
signals at the plasma membrane precisely coincided with the 
Casparian strip itself (Fig. 1g, h and Supplementary Fig. 4). CASP1- 
GFP signal was aligned between neighbouring cells, but not in direct 
contact. In addition, CASP1 signals at the membrane also matched the 
zone of adherence to the Casparian strip. 

CASPs could simply be associated proteins of an established 
Casparian strip network. Alternatively, CASP localization could precede 
and determine the site of Casparian strip deposition. To discriminate 
between these possibilities, we developmentally staged the onset of 
CASP1 expression, with respect to CSD appearance and Casparian strip 
diffusion barrier formation (Supplementary Fig. 5). In Arabidopsis, 
strict division patterns allow staging by counting the cellular distance 
from the meristem’*. We found that CASP1 transcription precedes 
formation of the CSD by 2.6 ± 0.8 cells, visualized as a zone of protein
exclusion. This matches the difference between initial CASP1 protein 
accumulation and its final localization (2.0 cells). Moreover, CASP1 
localization precedes the establishment of a functional diffusion barrier 
by 4.8 ± 1.5 cells. Thus, CASPs precede Casparian strip establishment
and are early markers of CSD formation. Moreover, live imaging of early 
CASP1-GFP accumulation indicates that CASPs could be involved in 
generating the CSD (Fig. 2a and Supplementary Movie 2). CASP1-GFP 
shows an initially uniform distribution at the plasma membrane. Signal 
strength then increases and accumulation at sites of incipient CSDs is 
observed. Localization in other plasma membrane regions gradually 
disappears and signals at the forming CSDs become stronger and more 
defined. Figure 2a shows a representative time lapse of this progression, 
taking place within 2 h (Supplementary Movie 2). Intriguingly, imaging

Figure 1 | A new protein family localizes to the Casparian strip membrane
domain. a, Predicted topology of CASP1. b, Cartoon of the observed dots in an
optical section that represent a longitudinal band in three dimensions.
c–f, CASP1 (red) is confined to the CSD (c), is complementary to NPSN12
(d), and outer (NIP5;1, e) and inner (BOR1(Y373A/Y398A/Y405A),
f) markers. Che, mCherry; Cit, citrine; en, endodermis; Tur, mTurquoise.
g, Casparian strips (CS) appear homogeneous in electron micrographs. CSD
adheres to Casparian strips. CW, cell wall; PM, plasma membrane; PMS, space
generated by plasmolysis. h, Immunogold-electron micrograph of CASP1-
GFP. Gold particles reside at either side of the Casparian strip. Scale bars:
c–f, 5 µm; h, i, 250 nm.


of the cell surface reveals that localization proceeds through a ‘string-of- 
pearls’ stage where CASP1-GFP patches appear along the equatorial 
line of the cell, gradually coalescing into a continuous band (Fig. 2b 
and Supplementary Movie 3). Initial random distribution, gradual re- 
location and patching at the future site of Casparian-strip formation all 
indicate that CASPs are not locating to a previously established domain 
but are associated with its formation. When expressed ectopically, most 
CASPs reach the plasma membrane, but none accumulates in CSD-like 
structures, indicating that unknown endodermis-specific factors, 
possibly other CASPs, are necessary for their localization (Supplemen- 
tary Fig. 6). Re-localization of CASP1-GFP should be associated with 
endocytosis and secretion of the protein. Indeed, CASP1-GFP accumu- 
lates in endosomal aggregates in response to brefeldin A (BFA), indi- 
cative of active endocytosis (Fig. 2c)”. However, this accumulation 
eventually disappears in differentiated cells where CASP1 localization 
has become confined to the CSD (Fig. 2d). Other proteins continued to

Figure 2 | CASP1 protein gradually localizes to a stable central domain.
a, b, CASP1-GFP time-lapse images. Numbers at the bottom of panels a and
b indicate time in minutes. Central (a) and surface (b) optical cuts are shown.
c, d, Effects of BFA on CASP1 localization. CASP1-GFP endosomal aggregates
form in early (c, yellow in overlay) but not in late stages (d, red in overlay).
Accumulation of FM4-64 (red) indicates ongoing BFA-sensitive recycling.
Arrowheads point to endosomal aggregates. ct, cortex; en, endodermis; st, stele.
Scale bars: 10 µm. e, FRAP experiments with CASP1-GFP (red). Recovery is
observed for CASP1-GFP (red) in the first expressing cell (1 cell), but not at
later stages (4 cell). mCitrine-NIP5;1 (green) is a control protein. n = 7–9
independent assays for each series, error bars = s.d.


display BFA sensitivity (Supplementary Fig. 7). This suggests very low 
endocytosis of localized CASP1-GFP. Furthermore, CASP1-GFP only 
shows measurable rates of fluorescence recovery after photobleaching 
(FRAP) in early-expressing cells, but becomes immobile upon accu- 
mulation at the position of the CSD, strongly contrasting with other 
endodermal plasma membrane markers (Fig. 2e). This indicates that 
localized CASP1 has very low rates of lateral diffusion. Together, this 
suggests an extensive scaffolding and/or matrix interaction of CASP1. 
This behaviour of CASP1 perfectly matches the characteristics of the 
CSD itself, which we have shown to act as a molecular fence, excluding 
other membrane proteins and blocking diffusion between central and 
peripheral plasma membrane domains”. 

A straightforward explanation for the immobility of CASP1-GFP 
and the presence of gradually fusing patches would be that CASPs
form polymeric lattices within the plane of the membrane. When
ectopically expressed, GFP fusions of CASP1, CASP2, CASP3 and 
CASP4 accumulate at the plasma membrane in a non-localized fashion, 
occasionally labelling intracellular compartments (Fig. 3a and Sup- 
plementary Fig. 6). CASP5-GFP, however, shows strong accumulation 
in large, irregular, intracellular compartments (Fig. 3b). By structural 
and immunogold electron microscopy we found that misexpressed 
CASP5-GFP induces altered domains of endoplasmic reticulum (ER) 
in which it accumulates (Fig. 3c–e and Supplementary Fig. 8). Extensive,
flattened ER cisternae often organized into large, multilamellar
structures. This CASP5-induced re-organization and aggregation of 
membranes could best be explained by CASP5 forming extensive 
protein scaffolds within the ER. Clearly, our findings indicate that 
CASP5, especially, requires endodermis-specific factors, possibly other 
CASPs, for its exit from the ER. Using CASP1-GFP immunoprecipitation/mass
spectrometry (IP/MS) analysis, we then tested whether

Figure 3 | CASP proteins interact with each other and show behaviour of
large, polymeric protein assemblies. a, b, Ectopically expressed CASP1-GFP
(a) and CASP5-GFP (b). ct, cortex; en, endodermis; ep, epidermis; st, stele.
Scale bars: 10 µm. c–e, Epon embedded (c, d) and immunogold labelled
(e) 35S::CASP5-GFP. Aberrant ER cisternae (black arrowheads, c, d) are
labelled with anti-GFP (e, white arrowheads showing parallel bilayers). Scale
bar: 250 nm. f, CASP1-GFP/CASP3-mCherry interaction. CASP3-mCherry is
observed as lower molecular mass forms (asterisk indicates expected molecular
mass of 52 kDa). g, CASP1-GFP in low-speed pellets. S6,500, supernatant;
STX100, supernatant after solubilization with Triton X-100. KNOLLE, PM-
ATPase and BRI1 show contrasting fractionation. h, Solubilization with Na-
citrate buffer (pH 2), Na2CO3 buffer (pH 11), 5% β-mercaptoethanol (β-ME),
1 M NaCl and 5% Triton X-100. P, pellet; S, supernatant.

CASPs reside in complexes with each other. By immunoprecipitations
with anti-GFP antibodies, using a CASP1::CASP1-GFP line, we
identified CASP3 as an interacting protein (Supplementary Fig. 9) and
confirmed this by co-immunoprecipitation using a CASP1::CASP1-GFP,
SCR::CASP3-mCherry double marker line (Fig. 3f). Similarly,
we demonstrated interactions of multiple other CASP pairs (Sup- 
plementary Fig. 10), supporting the idea of extensive multivalent inter- 
actions between CASPs. These interactions may contribute to CASP 
localization at the CSD, by trapping newly arriving CASPs through 
polymerization. 

A polymeric lattice of CASPs that surrounds an entire cell should 
have particular physical properties, because of its considerable size and 
mass. We therefore compared solubility of CASP1-GFP with that of 
other plasma membrane proteins and found that under native extrac- 
tion conditions, the large majority of CASP1-GFP was not recovered 
in the microsomal fraction, as is the case for most transmembrane 
proteins (Fig. 3g). Instead, CASP1-GFP was found associated with 
low-speed pellets from which it could only be eluted by strong deter- 
gents (Fig. 3h and Supplementary Fig. 11). Polymer formation could 
easily account for this unusual behaviour. The conditions needed for 
CASP1-GFP release from pellets are markedly similar to those of 
Casparian-strip-associated proteins whose purification was attempted 
previously”'. Clearly, fractionation into low-speed pellets could also be 
due to association of CASPs with the Casparian strip, and polymeriza- 
tion and cell wall association are obviously not mutually exclusive. 

Our data indicate that CASPs establish the CSD, possibly guided by 
some unidentified earlier positional cue. This, in turn, determines the 
localization of the Casparian strip itself and the establishment of an 
apoplastic barrier’. To demonstrate this, we analysed available 
mutants for the small CASP genes. Neither insertion lines nor targeting 
induced local lesions in genomes (TILLING) stop mutations were 
found for CASP2 and CASP4 (Supplementary Figs 1c and 12). The best 
casp1 allele identified was an insertion line that shows 85% knockdown 
of messenger RNA levels (casp1-1). Knockout alleles were identified for 
CASP3 (casp3-1 and casp3-2) and a strong knockdown for CASP5 
(casp5-1). Casparian strips are easily visualized by their intrinsic

Figure 4 | CASP proteins are necessary for the correct structure and
localization of the Casparian strips. a–e, Surface view of Casparian strip
network, detected as autofluorescence after clearing with two-photon
excitation. Z-stack projections (0.38 µm z-resolution), taken with identical
settings. f, g, Higher magnification images of wild type (f) and casp1-1 casp3-1
(g) with settings optimized for image quality. Scale bars: 10 µm.

autofluorescence after clearing of the roots’’. Using this test, we did not
observe any difference in Casparian strip formation between single
casp1 or casp3 mutants and wild type (Fig. 4a–c). Absence of a phenotype
in single mutants could be expected considering the similarity and
identical expression and localization pattern of the five family mem- 
bers. RNA interference approaches were efficient in reducing overall 
mRNA levels of CASPs, but not the protein levels of CASP-GFP 
fusions in differentiating cells (Supplementary Fig. 12). Yet, combining 
casp1-1 and casp3-1 insertion mutants did yield obvious defects in
Casparian strip formation. Instead of the strictly confined signal of 
wild-type Casparian strips, we noticed a much stronger and more 
diffuse autofluorescence in the double mutant (Fig. 4d), which was 
complemented by a CASP1::CASP1-GFP transgene (Fig. 4e). Three-
dimensional reconstructions revealed that casp1-1 casp3-1 mutants 
deposit autofluorescent cell wall material everywhere in the transversal 
and anticlinal walls with some preference for the cell corners. At the 
position of the Casparian strip, irregular, non-contiguous patches of 
signal are observed (Fig. 4f, g and Supplementary Movie 4). This 
demonstrates that CASPs are needed for the deposition of Casparian 
strip material into a contiguous, centrally positioned ring, but are not 
required for its synthesis or polymerization. This aberrantly structured 
Casparian strip apparently remains functional, as the apoplastic tracer 
propidium iodide is still blocked at the endodermis in the double mutant. 
Consistently, no obvious growth defects are observed in casp1-1 casp3-1 
plants. We expect to see strong phenotypes only in quintuple mutants 
that eliminate the activity of all CASP family members. 

Our analysis indicates that CASPs are central, probably structural, 
components of the Casparian strip membrane domain, forming a tight, 
polymeric scaffold within the membrane. We speculate that the CASPs 
serve as a platform to localize and immobilize cell wall biosynthetic 
enzymes and that matrix adhesion is mediated either through direct 
interaction or interacting proteins. Identification of CASP-regulatory 
factors and associated proteins will allow a mechanistic understanding 
of Casparian strip formation, and enable us to revisit long-standing 
concepts about root nutrient uptake through specific manipulation of 
Casparian strip formation. CASP-related proteins probably perform 
related functions in other cell types. Their common role might be to 
form membrane protein platforms for the localization of cell-wall- 
modifying activities or localized adhesion. Mechanisms of localized 
wall deposition and matrix adhesion in plants remain poorly under- 
stood processes, and the identification and characterization of the 
CASPs and their related proteins now provide an intriguing new 
avenue for their mechanistic dissection. 


METHODS SUMMARY 


For FRAP, 1-h treatment with 50 mM 2-deoxyglucose/0.02% sodium azide and
bleaching on 2 µm² was carried out. For time lapse, seedlings were transferred to
Lab-Tek II chambered coverglass and covered with agar block. MBF ImageJ
bundle was used for analysis. Native protein extraction was done with 100 mg
of roots, 200 µl extraction buffer ((in mM): 50 HEPES, pH 7.9, 300 sucrose, 150
NaCl, 5 EDTA, 10 CH3COOK, 2.5 Roche Complete, 1 PMSF); centrifugation for
5 min at 6,500g, 4 °C; washes, 3× in extraction buffer. Solubilization was
performed with extraction buffer plus detergents. For immunoprecipitation, pellet
obtained after centrifugation at 6,500g (P6,500) from 450 mg of roots was treated for
30 min on ice with 450 µl extraction buffer plus 1% CHAPS, and then spun for
30 min at 19,000g at 4 °C. Then, 410 µl were incubated on ice for 30 min with 50 µl
of µMACS anti-GFP microbeads. Beads were washed on column with 4× 200 µl,
1× 100 µl extraction buffer + 1% CHAPS. Elution was performed at 95 °C with
50 µl 1× MACS extraction buffer. For IP-MS, 2 g of roots were used; P6,500 was
re-suspended in extraction buffer + 0.5% Triton X-100. Anti-GFP (1:1,000) was
from Invitrogen, anti-DsRed (1:1,000) was from Clontech. Sequences have been
deposited in GenBank (HQ699533-50). Inhibitor and tracer treatments"’, high- 
pressure freezing”, immunogold localization” and IP-MS” were carried out as 
described previously. 




Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature.

METHODS


Optical microscopy. Images were taken with a Leica SP/2 confocal microscope. 
Excitation and detection windows were set as follows: GFP 488 nm, 500-600 nm; 
mCherry 594 nm, 600-700 nm; GFP and FM4-64/propidium iodide 488 nm, 500- 
550nm and 600-700 nm; Citrine/YFP and mCherry 514nm and 594nm, 520- 
560 nm and 600-700 nm. Images for CASP1-GFP three-dimensional reconstruction 
were taken with a 2-photon Zeiss LSM 710 NLO confocal, with a Chameleon Ultra II 
Ti:Sapphire laser at 820 nm using the non-descanned detector with a 500–550 nm
band-pass filter. For autofluorescence of Casparian strips, excitation at 770 nm was used,
with the same band-pass filter. BFA, propidium iodide and FM4-64 treatments were
done as in ref. 13. 

Electron microscopy. For ultrastructural analysis, roots of A. thaliana were cryo- 
fixed by high-pressure freezing (Baltec HPM 010) in hexadecane, freeze-substituted 
in acetone containing 2.5% osmium tetroxide and infiltrated with epon at 0°C. 
Ultrathin resin sections were stained with 2% uranyl acetate in 50% ethanol and lead 
citrate’. For immunogold localization of GFP fusion proteins (CASP1-GFP and 
CASP5-GFP), roots were fixed with 4% formaldehyde and 8% formaldehyde for 
45 min and 90-120 min, respectively. Root tips were embedded in 10% gelatine and 
infiltrated with a mixture of polyvinylpyrrolidone and sucrose’. After freezing, 
ultrathin cryosections were sectioned using a Leica Ultracut UCT/EMFCS and the 
frozen sections were transferred to electron microscopic grids for immunogold label- 
ling. The thawed sections were incubated with rabbit anti-GFP (1:500; Torrey Pines) 
and goat anti-rabbit F(ab′)2 fragments coupled to Nanogold (1:50; Nanoprobes).
After silver enhancement (HQ Silver, 8 min; Nanoprobes) sections were embedded 
in methyl cellulose containing 0.45% uranyl acetate. Ultrathin resin and cryosections 
were viewed in a LEO 906 transmission electron microscope at 80 kV accelerating 
voltage. GFP-immunogold labelling on wild-type cells was negligible. 

FRAP. Five-day-old seedlings were treated for 1 h with 50 mM 2-deoxyglucose and 
0.02% sodium azide, and then imaged with a Leica SP/2 confocal microscope. 
Bleaching was performed on an area of 2 µm² by excitation at 488 nm for GFP, at
488 nm, 496 nm and 514 nm for citrine. Images were taken with the same settings 
before bleaching (time 0) and then every minute for 20 min. Fluorescence intensities 
were measured with ImageJ. Intensities were corrected by subtracting background 
and forming ratios using intensities from areas without signal (background) and 
non-bleached areas with signal (measuring overall bleaching). Finally, pre-bleaching 
and post-bleaching fluorescence were set at 100% and 0%, respectively, to allow 
averaging of individual experiments. After one pre-bleach scan, bleaching was done 
by one scan with 16 line average, 2× frame average, maximally zooming onto the
bleach region of interest at maximal laser power. The first post-bleach scan was done
at minute one after the pre-bleach scan. For a more detailed description of a similar
protocol, see ref. 26.
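The intensity correction and normalisation described above can be sketched as follows. This is one plausible reading of the text, not the authors' actual ImageJ workflow: the function name and the example intensities are hypothetical, and it assumes that each frame yields a mean intensity for the bleached region, a non-bleached reference region with signal, and a background region without signal.

```python
def normalise_frap(bleached, reference, background):
    """Return FRAP recovery in percent for each frame.

    Each argument is a list of mean intensities per frame, where index 0 is
    the pre-bleach frame and index 1 is the first post-bleach frame.
    """
    if not (len(bleached) == len(reference) == len(background)):
        raise ValueError("all series must have the same length")
    if len(bleached) < 2:
        raise ValueError("need at least a pre- and a post-bleach frame")

    ratios = []
    for b, r, bg in zip(bleached, reference, background):
        if r == bg:
            raise ValueError("reference region has no signal above background")
        # Subtract background, then divide by the non-bleached reference
        # to compensate for bleaching caused by image acquisition itself
        ratios.append((b - bg) / (r - bg))

    pre, post = ratios[0], ratios[1]
    if pre == post:
        raise ValueError("no bleaching detected between frames 0 and 1")
    # Rescale so that pre-bleach = 100% and first post-bleach = 0%
    return [100.0 * (x - post) / (pre - post) for x in ratios]


# Hypothetical intensities for four frames
recovery = normalise_frap(bleached=[110, 20, 40, 65],
                          reference=[210, 210, 190, 190],
                          background=[10, 10, 10, 10])
print([round(v, 1) for v in recovery])
```

Setting the two anchor points per experiment is what allows curves from cells with different absolute brightness to be averaged, as the text notes.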

Time-lapse imaging of CASP1 expression. For confocal time-lapse imaging in 
CASP1::CASP1-GFP, 5-day-old seedlings were transferred into a Lab-Tek II 
chambered coverglass (Nunc) and covered with a small block of agar to prevent 
drying. Subsequently, slides were mounted and imaged on an inverted Leica SP/2 
confocal microscope. Image stacks were taken every 3 min for a total time of 2h. 
Confocal images were analysed and processed using the ImageJ plugins of the
MBF ImageJ bundle.

Protein fractionation. One hundred milligrams of roots from 5-day-old
seedlings were frozen in liquid nitrogen and ground using a Qiagen TissueLyser.
Two hundred microlitres of extraction buffer (50 mM HEPES pH 7.9, 300 mM
sucrose, 150 mM NaCl, 5 mM EDTA, 10 mM potassium acetate, 2.5× Roche
Complete protease inhibitors, 1 mM PMSF) were then added and samples
were spun at 6,500g, 4 °C for 5 min. Supernatant (S6,500) was collected and
pellets washed three times with extraction buffer. The last pellet was treated
with extraction buffer containing additional detergents (see figure legends for
concentrations).

Immunoprecipitation/MS analysis. All experiments were done using tagged
CASP variants expressed under their endogenous promoters. P6,500 from
450 mg of roots was treated for 30 min on ice with 450 µl extraction buffer plus
1% CHAPS, and then spun at 19,000g, 4 °C for 30 min. Then, 410 µl were incubated on
ice for 30 min with 50 µl of µMACS anti-GFP microbeads. Beads were then
washed on column with 4× 200 µl and 1× 100 µl extraction buffer plus 1%
CHAPS. Elution was performed at 95 °C with 50 µl 1× MACS elution buffer.
For western blots, 15 µl (3.3% of input, 30% of output) were loaded on the gel. GFP
was probed with a rabbit polyclonal anti-GFP (Invitrogen, 1:1,000), mCherry with
a rabbit polyclonal anti-DsRed (Clontech, 1:1,000). An HRP-coupled anti-rabbit
secondary antibody (Promega) was used at 1:1,000 for GFP, 1:20,000 for mCherry.
For IP-MS, 2 g of roots were used; P6,500 was re-suspended in extraction buffer +
0.5% Triton X-100. After gel migration and trypsinization, samples were analysed 
on a hybrid linear trap LTQ-Orbitrap mass spectrometer (Thermo Fisher) inter- 
faced to an Agilent 1100 nano HPLC system. Peptides were loaded onto a trapping 
microcolumn ZORBAX 300SB C18 (Agilent), eluted after 5 min and separated on 
a reversed-phase nanocolumn ZORBAX 300SB C18 column (Agilent). A 400 
nozzle ESI Chip (Advion Biosciences) was used for spraying, with a voltage of 
1.65 kV and mass spectrometer capillary transfer temperature of 200 °C. In data- 
dependent acquisition controlled by Xcalibur 2.0.7 software (Thermo Fisher), the 
six most intense precursor ions detected in the full MS survey were selected and 
fragmented. All MS/MS samples were analysed using Mascot 2.2 searching 
UNIPROT database. Scaffold_3 was used to validate MS/MS-based peptide and 
protein identifications, and to perform data set alignment. Probability thresholds
of 90.0% and 95% were used to identify peptides and proteins, respectively.
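The western-blot loading fractions quoted earlier in this section (3.3% of input, 30% of output) can be checked with a short calculation. The sketch assumes that "input" refers to the 450 µl CHAPS extract and "output" to the 50 µl eluate; both the function name and that interpretation are assumptions rather than statements from the paper.

```python
def percent_of(volume_ul, total_ul):
    """Express a loaded volume as a percentage of a total volume."""
    if total_ul <= 0:
        raise ValueError("total volume must be positive")
    return 100.0 * volume_ul / total_ul


loaded_ul = 15.0   # volume loaded per lane
input_ul = 450.0   # assumed input: CHAPS extract of the low-speed pellet
eluate_ul = 50.0   # assumed output: eluate from the anti-GFP microbeads

input_pct = percent_of(loaded_ul, input_ul)
output_pct = percent_of(loaded_ul, eluate_ul)
print(f"{input_pct:.1f}% of input, {output_pct:.0f}% of output")
```

Under these assumptions the eluate lane represents about nine times more starting material than the input lane, which matters when comparing band intensities between the two.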

Transgenic lines. Transgenic lines were generated by floral dipping of A. thaliana,
ecotype Col-0 (ref. 27). Sequences of plasmids used to generate transgenic lines are 
deposited in GenBank (HQ699533-50). RNA interference was done using the 
following clones from the AGRIKOLA project: CASP1 CATMA2a34330 (nucleo- 
tides 2-231 of the CDS); CASP2 CATMA3a10500 (nucleotides 595-754 of the 
CDS and 1-139 of the 3’ UTR); CASP3 CATMA2a25770 (nucleotides 2-161 of the
CDS); CASP5 CATMA5a13570 (nucleotides 3-162 of the CDS). Artificial
microRNAs were designed using the Web microRNA Designer
(http://wmd2.weigelworld.org) and expressed under the CASP1 promoter. The following positions
were targeted: CASP1 nucleotides 547-567 of the CDS; CASP2 nucleotides 66-86 
of the 3’ UTR; CASP3 nucleotides 624-644 of the CDS; CASP4 nucleotides 277- 
297 of the CDS; CASP5 nucleotides 150-170 of the CDS.

BCL6 enables Ph+ acute lymphoblastic leukaemia
cells to survive BCR-ABL1 kinase inhibition


Cihangir Duy’, Christian Hurtz°, Seyedmehdi Shojaee’, Leandro Cerchietti?, Huimin Geng’, Srividya Swaminathan!, 
Lars Klemm!, Soo-mi Kweon?, Rahul Nahar’, Melanie Braig*, Eugene Park’, Yong-mi Kim’, Wolf-Karsten Hofmann’, 
Sebastian Herzog®, Hassan Jumaa’, H. Phillip Koeffler’, J. Jessica Yu®, Nora Heisterkamp”, Thomas G. Graeber?, Hong Wu’, 


B. Hilda Ye®, Ari Melnick® & Markus Müschen!?


Tyrosine kinase inhibitors (TKIs) are widely used to treat patients 
with leukaemia driven by BCR-ABL1 (ref. 1) and other oncogenic
tyrosine kinases”*. Recent efforts have focused on developing more 
potent TKIs that also inhibit mutant tyrosine kinases**. However, 
even effective TKIs typically fail to eradicate leukaemia-initiating 
cells (LICs)**, which often cause recurrence of leukaemia after 
initially successful treatment. Here we report the discovery of a 
novel mechanism of drug resistance, which is based on protective 
feedback signalling of leukaemia cells in response to treatment 
with TKI. We identify BCL6 as a central component of this drug- 
resistance pathway and demonstrate that targeted inhibition of 
BCL6 leads to eradication of drug-resistant and leukaemia-initiating 
subclones. 

BCL6 is a known proto-oncogene that is often translocated in diffuse 
large B-cell lymphoma (DLBCL)’. In response to TKI treatment, BCR-ABL1
acute lymphoblastic leukaemia (ALL) cells upregulate BCL6
protein levels by approximately 90-fold: that is, to similar levels as in 
DLBCL (Fig. 1a). Upregulation of BCL6 in response to TKI treatment 
represents a novel defence mechanism, which enables leukaemia cells 
to survive TKI treatment. Previous work suggested that TKI-mediated
cell death is largely p53 independent. Here we demonstrate that BCL6 
upregulation upon TKI treatment leads to transcriptional inactivation 
of the p53 pathway. BCL6-deficient leukaemia cells fail to inactivate 
p53 and are particularly sensitive to TKI treatment. BCL6 ‘~ leuk- 
aemia cells are poised to undergo cellular senescence and fail to initiate 
leukaemia in serial transplant recipients. A combination of TKI treat- 
ment and a novel BCL6 peptide inhibitor markedly increased survival 
of NOD/SCID mice xenografted with patient-derived BCR-ABL1 ALL 
cells. We propose that dual targeting of oncogenic tyrosine kinases and 
BCL6-dependent feedback (Supplementary Fig. 1) represents a novel 
strategy to eradicate drug-resistant and leukaemia-initiating subclones 
in tyrosine-kinase-driven leukaemia. 

To elucidate mechanisms of TKI resistance in tyrosine-kinase-driven
leukaemia, we performed a gene expression analysis including
our own and published data of TKI-treated leukaemia. We identified BCL6
as a top-ranking gene in a set of recurrent gene expression changes,
some of which are shared with mitogen-activated protein-kinase
kinase (MEK) inhibition in BRAF V600E mutant solid tumour cells
(Supplementary Figs 2 and 3). TKI-induced upregulation of BCL6
messenger RNA (mRNA) levels was confirmed in multiple leukaemia
subtypes carrying oncogenic tyrosine kinases (Supplementary Fig. 2).
The BCR-ABL1 kinase, encoded by the Philadelphia chromosome
(Ph), represents the most frequent genetic lesion in adult ALL, defines
the subtype with a particularly poor prognosis and was therefore
chosen as the focus for this study.


To elucidate the regulation of BCL6 in Ph+ ALL, we investigated the
JAK2/STAT5 (ref. 11) and PI3K/AKT pathways downstream of
BCR-ABL1. We and others have shown that STAT5 suppresses BCL6 in B
cells. TKI-mediated upregulation of BCL6 was diminished by
constitutively active STAT5 (Fig. 1b) and deletion of STAT5 was sufficient
to upregulate BCL6, even in the absence of TKI treatment (Fig. 1c). In
agreement with previous work, overexpression of FoxO4 induced
BCL6 (Fig. 1d). In Ph+ ALL cells, FoxO factors are inactivated by
PI3K/AKT signalling, which is reversed by Pten (Supplementary
Fig. 4). Deletion of Pten, hence, abrogated the ability of the leukaemia
cells to upregulate BCL6 in response to TKI treatment (Fig. 1e).

In DLBCL, BCL6 is frequently translocated and suppresses
p53-mediated apoptosis. Although TKI treatment is less effective in
p53-/- Ph+ ALL, recent studies showed that TKI paradoxically
prevents the upregulation of p53 in response to DNA damage in
Ph+ ALL and chronic myelogenous leukaemia. A comparative
gene expression analysis of BCL6-/- and BCL6+/+ ALL cells
(Supplementary Fig. 5) identified Cdkn2a (Arf), Cdkn1a (p21), p53 and
p53bp1 as potential BCL6 target genes (Supplementary Fig. 6). Arf and
p53 protein levels were indeed unrestrained in BCL6-/- ALL (Fig. 2a).
TKI treatment of BCL6+/+ ALL resulted in strong upregulation of
BCL6 with low levels of p53, whereas BCL6-/- ALL cells failed to curb
p53 expression levels (Supplementary Fig. 7). Likewise, TKI treatment
excessively increased p53 levels when Pten-deficient ALL cells failed to
upregulate BCL6 (Fig. 1e).

When we identified direct targets of BCL6 by chromatin
immunoprecipitation (ChIP) in Ph+ ALL (Supplementary Figs 8-11), p53, p21 and p27
were among the genes with the strongest recruitment of BCL6 in
TKI-treated ALL (Fig. 2b and Supplementary Figs 9-11). Given that
cell-cycle arrest and senescence-associated genes were among the BCL6
targets, we studied the cell-cycle profile of leukaemia cells. BCL6-/-
ALL cells divided at a slightly reduced rate compared with BCL6+/+
ALL cells (Fig. 2c). Treatment with adriamycin (0.05 µg ml⁻¹) had no
significant effect on BCL6+/+ ALL cells in a senescence-associated
β-galactosidase assay but revealed that most BCL6-/- leukaemia
cells were poised to undergo cellular senescence (Fig. 2d). These
findings demonstrate that even low levels of BCL6 in the absence of TKI
treatment are critical to downregulate Arf/p53.

Clonal evolution of leukaemia involves acquisition of genetic lesions
through DNA damage. Interestingly, a comparative genomic
hybridization analysis revealed that genetic lesions were less frequent in
BCL6-deficient ALL (Supplementary Fig. 12), suggesting that increased
sensitivity to DNA damage limits clonal evolution in the absence of BCL6.
Because Arf and p53 are critical negative regulators of self-renewal,
we performed colony-forming assays. The colony frequencies of
BCL6-/- ALL cells were reduced by approximately 20-fold compared
with BCL6+/+ ALL (Fig. 3a). To study self-renewal in vivo, we
measured the ability of BCL6+/+ and BCL6-/- ALL cells to initiate
leukaemia in transplant recipients (Fig. 3b). Using luciferase bioimaging,
leukaemia engraftment was observed in both groups after 8 days.
BCL6+/+ ALL cells rapidly expanded and initiated fatal leukaemia,
whereas BCL6-/- ALL cells failed to expand from the initial
engraftment foci (Fig. 3c). Some mice that received BCL6-/- ALL cells
ultimately succumbed to leukaemia (Fig. 3b). Flow cytometry,
however, revealed that the leukaemias in the BCL6-/- group were
in fact derived from endogenous CD45.1+ cells of the irradiated
recipients and not from the injected CD45.2+ donor ALL cells
(Supplementary Fig. 13 and asterisks in Fig. 3b).

Figure 1 | Regulation of BCL6 expression in BCR-ABL1 ALL cells. a, Ph+
ALL cells were treated with and without imatinib (10 µmol l⁻¹) for 24 h.
Upregulation of BCL6 was compared with expression levels in DLBCL by
western blot. b, BCR-ABL1-transformed mouse ALL cells were transduced
with a constitutively active Stat5 mutant (STAT5-CA) or a control vector (green
fluorescent protein, GFP) and treated either with or without imatinib. BCL6
western blot was performed using β-actin as loading control. c, BCL6
expression upon imatinib treatment was studied by western blot in the presence
or absence of Cre-mediated deletion of Stat5 in BCR-ABL1-transformed
Stat5fl/fl mouse ALL. d, Mouse BCR-ABL1 ALL cells were transduced with
FoxO4-puromycin or a puromycin control vector and subjected to antibiotic
selection. Cells were collected and BCL6 mRNA levels were measured by
qRT-PCR relative to Hprt. e, Imatinib-induced BCL6 expression was studied by
western blot in the presence or absence of Cre-mediated deletion of Pten in
BCR-ABL1-transformed Ptenfl/fl mouse ALL cells.

Defective leukaemia initiation may be a consequence of impaired
homing to the bone marrow niche. Indeed, BCL6-/- ALL cells lack
expression of CD44 (Supplementary Fig. 14), which is critical for
homing of BCR-ABL1 LICs to the bone marrow microenvironment.
Retroviral reconstitution of CD44 markedly increased homing of
BCL6-/- ALL cells to the bone marrow niche, but failed to rescue
defective leukaemia initiation (Supplementary Fig. 14).

Using intrafemoral injection to circumvent homing defects, a limiting
dilution experiment (Fig. 3d) showed that 5 million BCL6-/- ALL cells
compared with only 10³ BCL6+/+ ALL cells were needed to initiate
fatal leukaemia. These findings suggest that the frequency of LICs in
BCL6-/- ALL (fewer than 1 in 100,000) is reduced by more than
100-fold compared with BCL6+/+ ALL (at least 1 in 1,000). An
alternative interpretation would be that LICs occur at a similar frequency in
BCL6-/- ALL but with reduced self-renewal activity. To address
potential ‘exhaustion’ of LICs, we performed a serial transplantation
with ALL cells that gave rise to disease in primary recipients after
injection of 5 million ALL cells. From the bone marrow, we isolated
CD19+ ALL cells for secondary intrafemoral injection. BCL6-/-
leukaemia was not transplantable in secondary recipients (Supplementary
Fig. 15). Although these findings do not exclude the possibility that the
LIC frequencies are reduced in BCL6-/- ALL, they support the notion
of LIC ‘exhaustion’ after secondary transplantation.
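
The arithmetic behind these frequency bounds can be sketched as follows. This is a minimal illustration only: it assumes the commonly used single-hit Poisson model of limiting dilution (the text does not state which model, if any, was applied), and the function and constant names are hypothetical.

```python
import math
from fractions import Fraction

# Bounds on LIC frequency quoted in the text.
WILD_TYPE_LIC_FREQUENCY = Fraction(1, 1_000)    # BCL6+/+: at least 1 in 1,000
KNOCKOUT_LIC_FREQUENCY = Fraction(1, 100_000)   # BCL6-/-: fewer than 1 in 100,000

# Minimum fold reduction implied by the two bounds.
fold_reduction = WILD_TYPE_LIC_FREQUENCY / KNOCKOUT_LIC_FREQUENCY


def prob_at_least_one_lic(cells_injected, lic_frequency):
    """Single-hit Poisson model (assumed): chance an inoculum holds >= 1 LIC."""
    return 1.0 - math.exp(-cells_injected * float(lic_frequency))


if __name__ == "__main__":
    print("minimum fold reduction:", fold_reduction)
    # Doses of the limiting dilution series described in the text.
    for dose in (1_000, 10_000, 100_000, 5_000_000):
        wild_type = prob_at_least_one_lic(dose, WILD_TYPE_LIC_FREQUENCY)
        knockout = prob_at_least_one_lic(dose, KNOCKOUT_LIC_FREQUENCY)
        print(dose, round(wild_type, 3), round(knockout, 3))
```

Because both figures are bounds rather than point estimates, the ratio is a lower limit on the reduction, which matches the "more than 100-fold" wording.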

To explore the therapeutic usefulness of pharmacological inhibition
of BCL6, we tested a BCL6 inhibitor, retro-inverso BCL6
peptide-inhibitor (RI-BPI), which blocks the repressor activity of BCL6
(ref. 25). Gene expression analysis confirmed that RI-BPI is a selective
and potent inhibitor of BCL6 (Supplementary Fig. 16). We
investigated the effect of RI-BPI on the self-renewal capacity of primary Ph+
ALL and the initiation of leukaemia in a mouse xenograft model.
Treatment with RI-BPI resulted in a reduction of colony formation
and delayed progression of leukaemia. Likewise, treatment of Ph+ ALL
with RI-BPI induced cellular senescence (Supplementary Fig. 17).

We next examined how gene dosage of BCL6 affects responses to
TKI. For instance, Pten-/- ALL cells lack the ability to upregulate the
p53-repressor BCL6 and are more sensitive to imatinib (Fig. 1e and
Supplementary Fig. 18). Dose-response studies in BCL6+/+, BCL6+/-
and BCL6-/- ALL (Fig. 4a) showed that sensitivity to imatinib was
significantly increased in BCL6-/- (half-maximal effective
concentration (EC50) 0.17 µmol l⁻¹) and even in BCL6+/- ALL cells (EC50
0.67 µmol l⁻¹) compared with BCL6+/+ ALL cells (EC50 1.10 µmol l⁻¹;
Fig. 4a). These findings indicate that maximum levels of BCL6
expression are required to prevent TKI-induced cell death. Indeed, inducible
activation of BCL6-ERT2 constructs in BCL6-/- ALL cells conferred a
strong survival advantage in the presence of imatinib (Fig. 4b).
Activation of BCL6 in BCL6+/+ ALL cells induced cell-cycle exit (not
shown) and no additional survival advantage, because these cells
already achieved maximal upregulation of endogenous BCL6 (Fig. 1a).
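
To make the EC50 comparison concrete, the short sketch below computes how much more sensitive each genotype is relative to BCL6+/+ cells, using only the three EC50 values quoted above. The Hill-type viability curve and its slope of 1 are illustrative assumptions, not the fitting model used for Fig. 4a, and all names are hypothetical.

```python
# EC50 values for imatinib quoted in the text (µmol per litre).
EC50_UMOL_PER_L = {"BCL6+/+": 1.10, "BCL6+/-": 0.67, "BCL6-/-": 0.17}


def fold_sensitisation(genotype, reference="BCL6+/+"):
    """Ratio of reference EC50 to genotype EC50 (>1 means more sensitive)."""
    return EC50_UMOL_PER_L[reference] / EC50_UMOL_PER_L[genotype]


def hill_viability(concentration, ec50, hill_slope=1.0):
    """Assumed Hill curve: fraction of viable cells at a given drug dose."""
    return 1.0 / (1.0 + (concentration / ec50) ** hill_slope)


if __name__ == "__main__":
    for genotype, ec50 in EC50_UMOL_PER_L.items():
        print(genotype,
              "fold sensitisation:", round(fold_sensitisation(genotype), 2),
              "viability at 1 µmol/l:", round(hill_viability(1.0, ec50), 2))
```

By definition, the modelled viability is 0.5 when the concentration equals the EC50, whatever slope is assumed.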

To address the role of BCL6-mediated repression of p53 in
TKI-resistance, p53-/- and p53+/+ ALL cells were treated with RI-BPI. The
synergistic effect between TKI treatment and RI-BPI is indeed partly
p53 dependent (Supplementary Fig. 19). In p53-/- ALL cells, the effect
of RI-BPI was significantly diminished compared with p53+/+ ALL.

To confirm that BCL6 has a similar function in patient-derived Ph+
ALL, primary ALL cells were transduced with a dominant-negative
BCL6 mutant (DN-BCL6-ERT2), which resulted in a marked
competitive disadvantage of Ph+ ALL cells that was further enhanced by
imatinib treatment (Fig. 4c). Similar observations in mouse ALL and in
an established Ph+ ALL cell line demonstrated that BCL6 promotes
survival of TKI-treated Ph+ ALL (Supplementary Fig. 20).

To test the effect of BCL6 inhibition on TKI resistance, we cultured
four primary Ph+ ALL cases in the presence or absence of imatinib, RI-BPI
or a combination of both (Supplementary Fig. 21). Initially, all four
Ph+ ALL cases responded to imatinib treatment, but subsequently
rebounded and were no longer sensitive to imatinib (10 µmol l⁻¹).
RI-BPI alone showed only slight effects, whereas the combination of
RI-BPI and imatinib rapidly induced cell death and effectively
prevented a rebound in all four cases (Supplementary Fig. 21). These
findings suggest that prolonged treatment with a combination of
imatinib/RI-BPI prevents acquisition of TKI-resistance. We next
examined the effect of imatinib/RI-BPI combinations on primary
TKI-resistance in Ph+ ALL. To this end, four human Ph+ ALL cell
lines that lacked BCR-ABL1 kinase mutations (Supplementary Table 1)
but which were highly refractory to imatinib (10 µmol l⁻¹) were treated
with or without imatinib, RI-BPI or a combination of both. Imatinib
alone did not achieve a therapeutic response, whereas the combination
with RI-BPI potentiated the effect of imatinib on the refractory ALL
cells (Supplementary Fig. 22).

Figure 2 | BCL6 is required for transcriptional inactivation of the Arf/p53
pathway in BCR-ABL1 ALL. a, Western blot analysis of CDKN2A (Arf) and
p53 expression in BCL6-/- and BCL6+/+ BCR-ABL1 ALL cells. b, Human
Ph+ ALL cells (TOM1) were treated with and without imatinib (10 µmol l⁻¹) for
24 h and were subjected to ChIP-on-chip analysis using a BCL6-specific
antibody. The y axis indicates enrichment versus input, the x axis the location of
probes within the respective loci relative to the transcriptional start site. The
dark and light green (control) or red (imatinib) tracings depict two replicates.
Recruitment to CDKN1A, CDKN1B, TP53 and HPRT (negative control) is
shown in Ph+ ALL cells and one DLBCL cell line (OCI-Ly7). c, Cell-cycle
analysis (BrdU/7-AAD staining). d, Staining for senescence-associated
β-galactosidase (SA-β-gal). ALL cells were treated with or without 0.05 µg ml⁻¹
adriamycin for 48 h to induce a low level of DNA damage. Percentages of
SA-β-gal+ cells are indicated (means ± SD; n = 3).

Figure 3 | BCL6 is required for leukaemia initiation in BCR-ABL1 ALL.
a, Ten thousand BCL6-/- or BCL6+/+ BCR-ABL1 ALL cells were plated in
semisolid agar, and colonies were counted after 10 days (numbers denote
means ± SD, n = 3). b, Overall survival of mice injected with 100,000 BCL6-/-
and BCL6+/+ BCR-ABL1 ALL cells was compared by Kaplan-Meier analysis.
Mice that developed CD45.1+ endogenous leukaemia instead of leukaemia
from injected CD45.2+ cells are indicated by asterisks (see Supplementary
Fig. 13). c, For a SCID LIC (SL-IC) experiment, BCL6-/- and BCL6+/+
BCR-ABL1 ALL cells were labelled with firefly luciferase and intravenously injected
into sublethally irradiated NOD/SCID mice. d, The SL-IC assay was repeated as
a limiting dilution experiment (10³, 10⁴, 10⁵, 5 million cells) and leukaemia cells
were directly injected into the femoral bone marrow to circumvent potential
engraftment defects.
[Figure panel annotation: P = 0.0008]
Figure 4 | BCL6 promotes survival of TKI-treated BCR-ABL1 ALL cells.
a, Imatinib sensitivity of BCL6-/-, BCL6+/- and BCL6+/+ ALL cells was
measured in a resazurin viability assay. b, BCL6-/- ALL cells were transduced
with BCL6-ERT2 or ERT2 vectors (tagged with GFP). ALL cells were treated with
or without 1 µmol l⁻¹ imatinib, and BCL6-ERT2 or ERT2 were induced by
4-hydroxytamoxifen. Relative changes of GFP+ cells after induction are
indicated. c, Patient-derived Ph+ ALL cells (ICN1) were transduced with
inducible dominant-negative BCL6 (DN-BCL6-ERT2) or ERT2 control vectors.
ALL cells were treated with or without 10 µmol l⁻¹ imatinib and
DN-BCL6-ERT2 or ERT2 were induced by 4-hydroxytamoxifen. Relative changes of GFP+
cells after induction are indicated. d, Patient-derived Ph+ ALL cells (TXL2)
were labelled with luciferase and 100,000 cells were injected. Mice were treated
seven times with either vehicle (green), nilotinib (25 mg kg⁻¹; grey) or a
combination of nilotinib and RI-BPI (25 mg kg⁻¹; red). e, Kaplan-Meier
survival analysis of the treated mice. Treatment days are indicated by
arrowheads.
To study the efficacy of combined tyrosine kinase and BCL6
inhibition in vivo, primary Ph+ ALL cells were labelled with luciferase and
xenografted into mice. Recipient mice were treated with either vehicle,
nilotinib or a combination of nilotinib and RI-BPI. Nilotinib is more
potent than imatinib, which only has marginal effects in mice.
Bioimaging demonstrated that seven to ten injections of RI-BPI
significantly enhanced the effect of nilotinib (Fig. 4d, e and
Supplementary Fig. 23). Whereas all mice treated with nilotinib alone succumbed
to leukaemia within 99 days after injection, seven of eight mice treated
with RI-BPI/nilotinib combination were still alive after 140 days
(Fig. 4d, e). Also, in a model for full-blown mouse leukaemia,
TKI/RI-BPI combinations proved effective and significantly prolonged
survival (Supplementary Fig. 24). Establishing a potential therapeutic
window of nilotinib/RI-BPI combinations, we found no evidence of
relevant toxicity (Supplementary Figs 25 and 26 and Supplementary
Table 2).

Although transcription factors have been considered intractable
therapeutic targets, the recent development of a small-molecule
inhibitor against BCL6 (ref. 29) holds promise for effectively targeting
TKI-resistance in patients with Ph+ ALL. Because TKI-resistance develops
in virtually all cases of Ph+ ALL, it appears particularly important to
target this novel pathway of TKI-resistance.


METHODS SUMMARY 


Cell culture. Primary leukaemia cells (Supplementary Table 1) were cultured on
OP9 stroma cells in alpha minimum essential medium without ribonucleotides
and deoxyribonucleotides, supplemented with 20% FBS, 2 mM L-glutamine, 1 mM
sodium pyruvate, 100 IU ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. Human
ALL cell lines were maintained in RPMI with GlutaMAX containing 20% FBS,
100 IU ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin. Mouse
BCR-ABL1-transformed ALL cells were maintained in IMDM with GlutaMAX
containing 20% FBS, 100 IU ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin and 50 µM
2-mercaptoethanol. Cell cultures were kept at 37 °C in a humidified incubator
under a 5% CO2 atmosphere.

BCR-ABL1 transformation. Transfection of a murine stem cell virus (MSCV)-based
retroviral vector encoding BCR-ABL1 was performed using Lipofectamine
2000. Retroviral supernatant was produced by co-transfecting 293FT cells with the
plasmids pHIT60 and pHIT123. Virus supernatant was collected, filtered through a
0.45 µm filter and loaded by centrifugation (2,000g, 90 min at 32 °C) on 50 µg ml⁻¹
RetroNectin-coated non-tissue well plates. Extracted bone marrow cells from mice
were transduced by BCR-ABL1 in the presence of 10 ng ml⁻¹ recombinant murine
interleukin-7 in RetroNectin-coated Petri dishes.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Patient samples, human cells and cell lines. Patient samples (Supplementary
Table 1) were provided by the departments of Hematology and Oncology,
University Hospital Benjamin Franklin, Berlin, Germany (W.-K.H.) and the
USC Norris Comprehensive Cancer Center in compliance with Institutional
Review Board regulations (approval from the Ethik-Kommission of the Charité,
Campus Benjamin Franklin and the IRB of the University of Southern California
Health Sciences Campus). Leukaemia cells from bone marrow biopsy of patients
with Ph+ ALL were xenografted into sublethally irradiated NOD/SCID mice by
tail vein injection. After passaging, leukaemia cells were collected and cultured on
OP9 stroma cells in alpha minimum essential medium (Alpha-MEM, Invitrogen)
without ribonucleotides and deoxyribonucleotides, supplemented with 20% fetal
bovine serum, 2 mmol l⁻¹ L-glutamine, 1 mmol l⁻¹ sodium pyruvate, 100 IU ml⁻¹
penicillin and 100 µg ml⁻¹ streptomycin. The human ALL cell lines BV173,
NALM-1, SUP-B15 and TOM1 (obtained from Deutsche Sammlung von
Microorganismen und Zellkulturen (DSMZ)) were maintained in Roswell Park
Memorial Institute medium (RPMI-1640, Invitrogen) with GlutaMAX containing
20% fetal bovine serum, 100 IU ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin.

Retroviral constructs and transduction. Transfection of retroviral constructs
encoding BCR-ABL1-IRES-GFP, BCR-ABL1-IRES-Neo, STAT5-CA, CD44S-Puro,
FoxO4-Puro, BCL6-ERT2-GFP, ERT2-GFP, DN-BCL6-ERT2-GFP,
Cre-ERT2-Puro, Cre-IRES-GFP, Puro-, Neo- and GFP-empty vector controls was
performed using Lipofectamine 2000 (Invitrogen) with Opti-MEM media
(Invitrogen). Retroviral supernatant was produced by co-transfecting HEK 293FT
cells with the plasmids pHIT60 (ref. 34) (gag-pol) and pHIT123 (ecotropic env) or
pHIT456 (amphotropic env). 293FT cells were cultured in high glucose Dulbecco’s
modified Eagle’s medium (DMEM, Invitrogen) with GlutaMAX containing 10%
fetal bovine serum, 100 IU ml⁻¹ penicillin, 100 µg ml⁻¹ streptomycin, 25 mmol l⁻¹
HEPES, 1 mmol l⁻¹ sodium pyruvate and 0.1 mmol l⁻¹ non-essential amino acids.
Regular media were replaced after 16 h by growth media containing 10 mmol l⁻¹
sodium butyrate. After incubation for 8 h, the media were changed back to regular
growth media. Twenty-four hours later, the virus supernatant was collected, filtered
through a 0.45 µm filter and loaded by centrifugation (2,000g, 90 min at 32 °C) two
times on 50 µg ml⁻¹ RetroNectin (Takara)-coated non-tissue six-well plates. Two
million to three million cells were transduced per well by centrifugation at 500g for
30 min and maintained for 48 h at 37 °C with 5% CO2 before transferring into
culture flasks. Transduced cells with oestrogen receptor fusion proteins were
induced with 4-hydroxytamoxifen (500 nM).

In vivo model for BCR-ABL1-transformed ALL and bioluminescence
imaging. After cytokine-independent proliferation, BCR-ABL1-transformed
ALL cells were labelled with a lentiviral vector encoding firefly luciferase with a
neomycin selection marker. After selection with 0.5-2 mg ml⁻¹ G418 for 10 days,
luciferase-labelled ALL cells were injected into sublethally irradiated (250 cGy)
NOD/SCID mice. Human primary leukaemia cells were transduced with a
lentiviral firefly luciferase carrying a GFP marker. After expansion of sorted GFP+
cells, 1 × 10⁵ cells were injected through the tail vein into sublethally irradiated
NOD/SCID mice. Bioimaging of leukaemia progression in mice was performed at
different time points using an in vivo IVIS 100 bioluminescence/optical imaging
system (Xenogen). D-Luciferin (Promega) dissolved in PBS was injected
intraperitoneally at a dose of 2.5 mg per mouse 15 min before measuring the luminescence
signal. General anaesthesia was induced with 5% isoflurane and continued during
the procedure with 2% isoflurane introduced through a nose cone. All mouse
experiments were subject to institutional approval by the Children’s Hospital
Los Angeles Institutional Animal Care and Use Committee.

Extraction of bone marrow cells from mice. To avoid inflammation-related
effects in BCL6-/- mice, bone marrow cells were extracted from young
age-matched BCL6+/+ and BCL6-/- mice (younger than 6 weeks of age) without signs
of inflammation. Bone marrow cells were obtained by flushing cavities of femur
and tibia with PBS. After filtration through a 70 µm filter and depletion of
erythrocytes using a lysis buffer (BD PharmLyse, BD Biosciences), washed cells
were either frozen for storage or subjected to further experiments.

BCL6-/-, Stat5fl/fl, Ptenfl/fl and p53-/- mice. A summary of mouse strains used
in this study is provided in Supplementary Table 3. Bone marrow cells from
BCL6-/- (generated in R. Dalla-Favera’s laboratory), Stat5fl/fl (generated in L.
Henninghausen’s laboratory), Ptenfl/fl (generated in H. Wu’s laboratory) and
p53-/- (obtained from Jackson Laboratory) mice were collected and retrovirally
transformed by BCR-ABL1 (ref. 30) in the presence of 10 ng interleukin-7 per
millilitre (Peprotech) in RetroNectin (Takara)-coated Petri dishes as described
below. All BCR-ABL1-transformed ALL cells derived from bone marrow of mice
were maintained in Iscove’s modified Dulbecco’s medium (IMDM, Invitrogen)
with GlutaMAX containing 20% fetal bovine serum, 100 IU ml⁻¹ penicillin,
100 µg ml⁻¹ streptomycin and 50 µM 2-mercaptoethanol. BCR-ABL1-transformed ALL
cells were propagated only for short periods of time, usually no longer than
2 months, to avoid acquisition of additional genetic lesions during long-term cell
culture.

RI-BPI. Homo-dimerization of the amino (N)-terminal Broad Complex,
Tramtrack, Bric à brac (BTB) domain of BCL6 forms a lateral groove motif, which
is required to recruit co-repressor proteins such as BCL6 co-repressor (BCoR),
nuclear receptor co-repressor (N-CoR) and silencing mediator of retinoid and
thyroid receptors (SMRT). BCoR, N-CoR and SMRT interact in a mutually
exclusive manner with an 18-amino-acid motif in the lateral groove of the BCL6 BTB
domain to form a BCL6 repression complex. A recombinant peptide
containing the SMRT BBD (BCL6-binding domain) along with a cell-penetrating TAT
domain was able to inhibit the transcriptional repressor activity of BCL6. Based
on this initial work, the peptidomimetic molecule RI-BPI with superior potency
and stability was developed and used for BCL6 inhibition. RI-BPI represents a
retro-inverso TAT-BBD-Fu (fusogenic) peptide that was synthesized by
Biosynthesis Inc. (Lewisville, TX) and stored lyophilized at −20 °C until
reconstituted with sterile, distilled, degassed water immediately before use. The purity
determined by high-performance liquid chromatography-mass spectrometry was
95% or higher. RI-BPI was injected intraperitoneally into mice.

BCR-ABL1 TKI. Imatinib (STI571) and nilotinib (AMN107) were obtained from
Novartis Pharmaceuticals or from LC Laboratories. Stock solutions of imatinib
were prepared in sterile, distilled water at 10 mmol l⁻¹ and stored at −20 °C.
Nilotinib was dissolved in either DMSO (dimethyl sulphoxide) or NMP
(N-methyl-2-pyrrolidone) just before administration. Nilotinib dissolved in DMSO
was vortexed with four volumes of peanut butter until a homogeneous mixture was
formed. Nilotinib (free base) solubilized in NMP was diluted with PEG 300
(polyethylene glycol 300) in a 10/90 (vol/vol) ratio. Cohorts of mice were treated with
oral administration of vehicle or nilotinib (25 mg kg⁻¹ day⁻¹ or 50 mg kg⁻¹
day⁻¹) once daily at indicated time points.
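
The concentrations and doses above translate into bench quantities by simple proportionality. The sketch below is illustrative only: the 10 ml culture volume and the 20 g body weight are assumed example values that do not appear in the text, and the function names are hypothetical.

```python
def stock_volume_ul(stock_mmol_per_l, final_umol_per_l, final_volume_ml):
    """Volume of stock (µl) needed for a working solution, from C1*V1 = C2*V2."""
    stock_umol_per_l = stock_mmol_per_l * 1000.0
    final_volume_ul = final_volume_ml * 1000.0
    return final_umol_per_l * final_volume_ul / stock_umol_per_l


def dose_mg(dose_mg_per_kg, body_weight_g):
    """Absolute dose (mg) for one animal given a mg per kg dose."""
    return dose_mg_per_kg * body_weight_g / 1000.0


if __name__ == "__main__":
    # 10 mmol/l imatinib stock diluted to 10 µmol/l in an assumed 10 ml culture.
    print("imatinib stock (µl):", stock_volume_ul(10, 10, 10))
    # Nilotinib at 25 or 50 mg/kg for an assumed 20 g mouse.
    for dose in (25, 50):
        print(dose, "mg/kg ->", dose_mg(dose, 20), "mg per mouse")
```

A 10 mmol l⁻¹ stock used at 10 µmol l⁻¹ is a 1:1,000 dilution, so the stock volume is always one thousandth of the final volume.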

Clonality analysis and spectratyping of B-cell populations. Immunoglobulin
VH-DJH gene rearrangements were amplified using PCR primers specific for the
J558 VH region gene with a primer specific for the Cµ constant region gene. Using
a FAM-conjugated Cµ constant region or a JH gene-specific primer in a run-off
reaction, PCR products were labelled and subsequently analysed on a capillary
sequencer (ABI3100, Applied Biosystems) by fragment-length analysis. Sequences
of primers used are given in Supplementary Table 4.

Affymetrix GeneChip analysis. Total RNA from cells used for microarray or
RT-PCR analysis was isolated by RNeasy (Qiagen) purification. RNA quality was first
checked by using an Agilent Bioanalyser (Agilent Technologies). Complementary
DNA (cDNA) was generated from 5 µg of total RNA using a poly(dT)
oligonucleotide containing a T7 RNA polymerase initiation site and the SuperScript III
Reverse Transcriptase (Invitrogen). Biotinylated cRNA was generated and
fragmented according to the Affymetrix protocol and hybridized to U133A 2.0 human
or 430 mouse microarrays (Affymetrix). After scanning (GeneChip Scanner 3000
7G, Affymetrix) of the GeneChip arrays, the generated CEL files were imported to
BRB-ArrayTools (http://linus.nci.nih.gov/BRB-ArrayTools.html) and processed
using the RMA algorithm (Robust Multi-array Average) for normalization and
summarization. Relative signal intensities of probe sets were determined by
comparing the signal intensity from TKI-treated and untreated cells to the average
signal value of the respective cell line or a group of cell lines. The calculated signal
ratios of probe sets were visualized as a heatmap with Java Treeview.
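
The relative-signal step described above can be sketched as follows. This is a minimal illustration, assuming the RMA output is on a log2 scale so that a ratio to the average becomes a difference from the mean; the function name and the sample labels are hypothetical, and the input numbers are placeholders rather than data from the study.

```python
def relative_log2_signal(log2_intensities):
    """Centre one probe set's log2 signals on their mean across samples.

    `log2_intensities` maps a sample label to the RMA (log2) value of a
    single probe set; the result is the log2 ratio to the average signal.
    """
    mean_signal = sum(log2_intensities.values()) / len(log2_intensities)
    return {sample: value - mean_signal
            for sample, value in log2_intensities.items()}


if __name__ == "__main__":
    # Placeholder values for one probe set in one cell line.
    probe_set = {"untreated": 8.0, "TKI-treated": 10.0}
    print(relative_log2_signal(probe_set))
```

Values computed this way are symmetric around zero, which is the form a heatmap viewer typically expects for up- and downregulation.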

Target validation of RI-BPI in human Ph+ ALL cells. Ph+ ALL cell lines
(BV173, NALM1 and TOM1) were treated with vehicle (control), 10 µmol l⁻¹
imatinib or imatinib + 20 µmol l⁻¹ RI-BPI for 24 h and maintained in
Allprotect (Qiagen) at −80 °C until RNA isolation using an RNeasy Plus kit
(Qiagen). RNA integrity was determined using the RNA 6000 Nano LabChip
kit on Agilent 2100 Bioanalyser (Agilent Technologies). Two independent samples
were analysed for each condition. RNA (1 µg) was hybridized to Agilent 60-mer
Whole Human Genome Microarrays (part number G4112A) according to the
manufacturer’s recommendations. After hybridization, the processed microarrays
were scanned with the Agilent DNA microarray scanner (part number G2505C)
and extracted with Agilent Feature Extraction software version 8.5
(GE1-v5_10_Apr08). For computational analysis of signal, we used the dye-normalized
signal after the surrogate algorithm (gProcessedSignal), extracted from the .txt files
for each array and for all the probes. This value was subjected to log2
transformation and median array normalization. The fold changes of imatinib
compared with control and (imatinib + RI-BPI) compared with imatinib were
calculated for each cell line and for each gene. A data set containing previously
identified BCL6 target genes (obtained from NimbleGen arrays) was mapped into
the Agilent probe sets using the Agilent and NimbleGen array annotation files. To
determine if two data sets differed significantly, we compared the fold change in
BCL6 target genes with the fold change in BCL6 non-target genes for each data set
(imatinib compared with control, and imatinib + RI-BPI compared with
imatinib) by the Kolmogorov-Smirnov test. The Kolmogorov-Smirnov test
determines if two data sets (gene expression values for BCL6 target genes and
non-target genes) differ significantly. Heat maps and other analysis were obtained using
the R statistical software (http://www.r-project.org).
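
The normalization and comparison steps described above can be sketched in a few lines. The code below is a hedged, standard-library illustration and not the authors' R analysis: it log2-transforms and median-centres one array, then computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical distribution functions). It returns the statistic only; a P value would normally come from a statistics library such as SciPy's `ks_2samp`. Function names are hypothetical.

```python
import math
import statistics
from bisect import bisect_right


def log2_median_normalise(signals):
    """Log2-transform positive signals and subtract the array median."""
    logged = [math.log2(value) for value in signals]
    median = statistics.median(logged)
    return [value - median for value in logged]


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: maximum distance between empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


if __name__ == "__main__":
    # Placeholder log2 fold changes, not values from the study.
    target_genes = [0.9, 1.2, 1.5, 0.7]
    non_target_genes = [0.0, -0.1, 0.2, 0.1, -0.3]
    print("KS statistic:", ks_statistic(target_genes, non_target_genes))
```

A statistic near 1 means the fold changes of target and non-target genes barely overlap, whereas a value near 0 means the two distributions are indistinguishable.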

ChIP-on-chip analysis. ChIPs were performed with modifications as described.
Briefly, 2.5 × 10⁷ Ph+ ALL cell lines (BV173, NALM1 and TOM1) were treated with
or without 10 µmol l⁻¹ imatinib for 24 h. Then the cells were double cross-linked
with 2 mmol l⁻¹ EGS cross-linker and 1% formaldehyde. After sonication,
immunoprecipitations were performed using 5 µg BCL6 (N3, Santa Cruz Biotechnology)
or control IgG antibody (Sigma-Aldrich) from the chromatin fragments of 2.5 × 10⁷
human Ph+ ALL cells. After validation of enrichment by Q-ChIP, BCL6 or control
IgG, ChIP products and their respective input genomic fragments were amplified by
ligation-mediated PCR. The products were co-hybridized with the respective input
samples to NimbleGen promoter arrays (human genome version 35). Quantitative
ChIP was performed again at this stage for selected positive control loci to verify that
the enrichment ratios were retained. The genomic products of two biological ChIP
replicates were labelled with Cy5 (for ChIP products) and Cy3 (for input) and
co-hybridized on custom-designed genomic tiling arrays generated by NimbleGen
Systems. These high-density tiling arrays contained 50-residue oligonucleotides with
an average overlap of 25 bases, omitting repetitive elements. After hybridization, the
relative enrichment for each probe was calculated as the signal ratio of ChIP to input.
Peaks of enrichment for BCL6 relative to input were captured with a five-probe
sliding window, and the results were uploaded as custom tracks into the University of
California Santa Cruz genome browser and graphically represented as histograms.
Two replicates were performed with each condition.

Data analysis of ChIP-on-chip experiments. To identify target genes of BCL6 in 
these experiments, we computed the log-ratio between the probe intensities of the 
ChIP product and input and took moving averages of log-ratio of three neigh- 
bouring probes and determined the maximum value for each gene promoter and 
the random permutation probes as background control“*. The cut-off for each 
array was established as higher than the 99th percentile of the 24,175 log-ratio 
values generated from random permutation probes. A locus with maximum mov- 
ing average above cut-offs in two replicates was considered a potential binding site. 
Because this highly stringent overlap approach can produce a high false-negative
rate, we also computed the correlations among peaks between the replicates to
rescue promoters that did not pass cut-off in one replicate. We calculated the
Pearson correlation coefficient of the promoter's probe signals between
replicates, and promoters with a correlation higher than 0.8 were rescued and
included in our final set of BCL6 targets. In addition, all peaks were mapped back 
to the genome using BLAT (the BLAST-like Alignment Tool, http://genome.ucsc.edu)
to identify genes on opposite strands that could be regulated from the
same bidirectional promoter. Two genes were considered to be bidirectional partners 
when they were located on the opposite strands in a ‘head-to-head’ orientation and 
their transcription start sites were separated by less than 0.5 kilobases. 
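
The target-calling rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the dictionary layout (promoter name mapped to per-probe log-ratios) and the interpolation used for the percentile are all assumptions, and the background values are taken as already computed from the random permutation probes.

```python
import math
from statistics import mean


def moving_average(values, window=3):
    """Moving average over neighbouring probes (window of three, as in the text)."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def percentile(values, pct):
    """Percentile with linear interpolation (pct in 0..100); an assumed convention."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100.0
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length probe profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def call_targets(rep1, rep2, background1, background2, rescue_r=0.8):
    """rep1/rep2: dict of promoter -> log-ratios (ChIP over input) per probe.
    background1/2: log-ratio values from random permutation probes, per array."""
    cut1 = percentile(background1, 99)
    cut2 = percentile(background2, 99)
    targets = set()
    for promoter in rep1.keys() & rep2.keys():
        peak1 = max(moving_average(rep1[promoter]))
        peak2 = max(moving_average(rep2[promoter]))
        passed = (peak1 > cut1) + (peak2 > cut2)
        if passed == 2:
            # Above the cut-off in both replicates.
            targets.add(promoter)
        elif passed == 1 and pearson(rep1[promoter], rep2[promoter]) > rescue_r:
            # Rescued: failed one replicate but profiles correlate strongly.
            targets.add(promoter)
    return targets


# Toy data, invented purely to exercise the rule.
background = [i / 100 for i in range(101)]
rep1 = {"A": [0.1, 2.0, 2.2, 2.1, 0.2], "B": [0.0, 0.1, 0.0, 0.1, 0.0],
        "C": [0.1, 1.5, 1.6, 1.4, 0.1], "D": [1.5, 1.6, 1.4, 0.1, 0.1]}
rep2 = {"A": [0.2, 1.9, 2.3, 2.0, 0.1], "B": [0.1, 0.0, 0.1, 0.0, 0.1],
        "C": [0.0, 0.9, 1.0, 0.8, 0.1], "D": [0.1, 0.1, 0.2, 0.9, 0.8]}
toy_targets = call_targets(rep1, rep2, background, background)
```

In the toy data, promoter A passes in both replicates, C passes in one and is rescued by correlation, and B and D are rejected. A flat probe profile would make the correlation undefined, so a real implementation would need to guard against zero variance.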
Comparative genomic hybridization. To analyse genetic instability and acquisi-
tion of genetic lesions during long-term cell culture, genomic DNA of BCR-ABL1-
transformed BCL6+/+ and BCL6−/− ALL cells was extracted after culturing for
4 months. Genomic DNA was isolated using the PureLink genomic DNA kit
(Invitrogen). Three samples of each ALL type were co-hybridized with genomic 
DNA extracted from normal untransformed mouse cells to NimbleGen mouse 
720k Whole-Genome Tiling arrays (NimbleGen Systems) in accordance with the 
manufacturer’s recommendations. Copy number variations were analysed using 
the FASST-segmentation algorithm in Nexus software (BioDiscovery). Copy-
number analysis was performed using a significance threshold of 1 × 10” and
a log2 ratio cut-off at +0.2 for regions sized 1,000 kilobase pairs.
Senescence-associated β-galactosidase assay. Senescence-associated β-galactosi-
dase activity was assayed on cytospin preparations as described’’. Briefly, a fix-
ative solution (0.25% glutaraldehyde, 2% paraformaldehyde in PBS pH 5.5 for mouse
cells and pH 6 for human cells) was freshly generated. To this end, 1 g paraformalde-
hyde was dissolved in 50 ml PBS at pH 5.5 by heating followed by addition of 250 µl
of a 50% stock glutaraldehyde solution. 1× X-gal staining solution was prepared as
follows (10 ml): 9.3 ml PBS/MgCl2, 0.5 ml 20× KC solution (that is, 820 mg
K3Fe(CN)6 and 1,050 mg K4Fe(CN)6·3H2O in 25 ml PBS) and 0.25 ml 40×
X-gal (that is, 40 mg 5-bromo-4-chloro-3-indolyl β-D-galactoside per millilitre of
N,N-dimethylformamide) solution were mixed. For BCR-ABL1-transformed ALL
cells, 100,000 cells per cytospin were used (700 r.p.m., 8 min). The fixative solution
was pipetted on the cytospins and incubated for 10 min at room temperature, then
washed twice for 5 min in PBS/MgCl2. Cytospin preparations were submerged in 1×
X-gal solution, incubated overnight at 37 °C in a humidified chamber and washed
twice in PBS. Slides were mounted before they dried.
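
As a quick check of the recipe arithmetic above, the stated volumes and stock strengths can be run through a short calculation. This is only a sketch: the variable names are invented, and it assumes that adding 250 µl of glutaraldehyde stock to 50 ml brings the volume to about 50.25 ml and that "10 ml" is the nominal batch volume.

```python
# Fixative: 1 g paraformaldehyde in 50 ml PBS, plus 250 µl of a 50% glutaraldehyde stock.
pfa_percent_wv = 1.0 / 50.0 * 100.0                 # grams per 100 ml
glut_percent = 0.250 * 50.0 / (50.0 + 0.250)        # 50% stock diluted into ~50.25 ml

# Staining solution, nominal 10 ml batch (volumes in ml as given in the text).
volumes_ml = {"buffer": 9.3, "20x KC": 0.5, "40x X-gal": 0.25}
nominal_ml = 10.0
total_ml = sum(volumes_ml.values())

kc_final_x = 20.0 * volumes_ml["20x KC"] / nominal_ml         # fold concentration of KC
xgal_final_x = 40.0 * volumes_ml["40x X-gal"] / nominal_ml    # fold concentration of X-gal
xgal_final_mg_per_ml = 40.0 * volumes_ml["40x X-gal"] / nominal_ml  # stock is 40 mg/ml

print(f"paraformaldehyde: {pfa_percent_wv:.2f}% (w/v)")
print(f"glutaraldehyde:   {glut_percent:.3f}%")
print(f"batch volume:     {total_ml:.2f} ml")
print(f"KC: {kc_final_x:.1f}x, X-gal: {xgal_final_x:.1f}x ({xgal_final_mg_per_ml:.1f} mg/ml)")
```

The numbers agree with the stated 2% paraformaldehyde and roughly 0.25% glutaraldehyde, and both stocks come out at 1× in the nominal 10 ml; the component volumes actually sum to 10.05 ml, a negligible excess.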

Western blotting. Cells were lysed in CelLytic buffer (Sigma) supplemented with 
1% protease inhibitor cocktail (Pierce). Ten micrograms of protein mixture per 
sample were separated on NuPAGE (Invitrogen) 4-12% Bis-Tris gradient gels and 
transferred on PVDF membranes (Immobilon, Millipore). To detect mouse and
human proteins by western blot, primary antibodies were used with the
WesternBreeze immunodetection system (Invitrogen). The following antibodies 
were used: human BCL6 (clones D8 and N3, Santa Cruz Biotechnology), mouse 
BCL6 (rabbit polyclonal, Cell Signaling Technology), Arf (4C6/4, Cell Signaling 
Technology), p53 (1C12, Cell Signaling Technology), PTEN (A2B1, Santa Cruz), 
global Stat5 (3H7, Cell Signaling Technology) and phospho-Y694 Stat5 (14H2, 
Cell Signaling Technology). Antibodies against β-actin were used as a loading
control (C4, Santa Cruz). 

Flow cytometry. Antibodies against mouse CD19 (1D3), B220 (RA3-6B2), CD3 
(17A2), CD43 (S7), CD45.1 (A20), CD45.2 (104), CD44 (IM7 and G44-26) and c-Kit 
(2B8) as well as respective isotype controls were purchased from BD Biosciences. For 
apoptosis analyses, Annexin V, propidium iodide and 7-AAD were used (BD 
Biosciences). 

Cell viability assay. Fifty thousand BCR-ABL1-transformed ALL cells per well were
seeded in a volume of 100 µl B-cell medium on Optilux 96-well plates (BD
Biosciences). Imatinib was diluted in medium and added at the indicated concentra-
tion in a total culture volume of 150 µl. After culturing for 3 days, 15 µl of Resazurin
(R&D) was added to each well and incubated for 4 h at 37 °C. The fluorescence was
read at 535 nm and the reference wavelength was 590 nm. Fold changes were calcu-
lated using baseline values of untreated cells as a reference (set to 100%).
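
The normalization to untreated cells can be written as a one-line calculation. The sketch below is illustrative only: the function name and the optional blank subtraction are assumptions, and the fluorescence readings are invented.

```python
def percent_of_untreated(treated, untreated, blank=0.0):
    """Express each treated-well reading as a percentage of the mean untreated reading.

    blank is an optional medium-only background (an assumption; not stated in the text).
    """
    baseline = sum(untreated) / len(untreated) - blank
    return [100.0 * (value - blank) / baseline for value in treated]


# Invented readings: two treated wells against two untreated wells.
viability = percent_of_untreated([500.0, 250.0], [1000.0, 1000.0])
```

Here the untreated mean defines 100%, so readings of 500 and 250 against a baseline of 1,000 come out as 50% and 25% viability.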
Colony-forming assay. The methylcellulose colony-forming assays were per-
formed with 10,000 BCR-ABL1-transformed mouse BCL6−/− or BCL6+/+ or
10,000 human BCR-ABL1 ALL cells. Cells were re-suspended in MethoCult med- 
ium (StemCell Technologies) and cultured on dishes (3 cm diameter) with an extra 
water supply dish to prevent evaporation. After 7-14 days, colonies were counted. 
Cell-cycle analysis. For cell-cycle analysis in BCR-ABL1 ALL cells, the BrdU flow 
cytometry kit for cell-cycle analysis (BD Biosciences) was used according to manu- 
facturer’s instructions. BrdU incorporation (APC-labelled anti-BrdU antibodies) was 
measured with DNA content (7-amino-actinomycin-D) in fixed and permeabilized 
cells. The analysis was gated on viable cells that were identified based on scatter 
morphology”. 

In vivo toxicology studies of RI-BPI/nilotinib combinations. Fifteen adult male 
C57BL/6 mice were purchased from the National Cancer Institute and rando-
mized into three groups of five. One group was exposed to intraperitoneal admin-
istration of RI-BPI 20 mg kg−1 of body weight three times a week; the second
group was treated with RI-BPI 20 mg kg−1 of body weight three times a week plus
nilotinib 25 mg kg−1 of body weight three times a week by oral gavage. A third
group of five mice was treated with vehicle and used as controls. The mice were 
observed, examined and weighed every other day during the treatment period. 
Blood was collected at the end of the treatment by retro-orbital bleeding under 
anaesthesia. All mice were euthanized by CO2 aspiration and the organs were
harvested, weighed and macroscopically examined. Histology sections were pre- 
pared with haematoxylin and eosin staining. Pictures were taken using a digital 
camera (Olympus DP72) attached to a light microscope (Axioskop, Carl Zeiss) 
with ×4 and ×20 Plan Neofluar objectives (Carl Zeiss).
Dual functions of Tet1 in transcriptional regulation in
mouse embryonic stem cells


Hao Wu, Ana C. D’Alessio, Shinsuke Ito, Kai Xia, Zhibin Wang, Kairong Cui, Keji Zhao, Yi Eve Sun & Yi Zhang


Epigenetic modification of the mammalian genome by DNA methy- 
lation (5-methylcytosine) has a profound impact on chromatin 
structure, gene expression and maintenance of cellular identity’. 
The recent demonstration that members of the Ten-eleven trans- 
location (Tet) family of proteins can convert 5-methylcytosine to 
5-hydroxymethylcytosine raised the possibility that Tet proteins are 
capable of establishing a distinct epigenetic state**°. We have 
recently demonstrated that Tet1 is specifically expressed in murine 
embryonic stem (ES) cells and is required for ES cell maintenance’. 
Using chromatin immunoprecipitation coupled with high- 
throughput DNA sequencing, here we show in mouse ES cells that 
Tet1 is preferentially bound to CpG-rich sequences at promoters of 
both transcriptionally active and Polycomb-repressed genes. 
Despite an increase in levels of DNA methylation at many Tet1- 
binding sites, Tet1 depletion does not lead to downregulation of all 
the Tet1 targets. Interestingly, although Tet1-mediated promoter
hypomethylation is required for maintaining the expression of a
group of transcriptionally active genes, it is also involved in repres-
sion of Polycomb-targeted developmental regulators. Tet1 contri-
butes to silencing of this group of genes by facilitating recruitment
of PRC2 to CpG-rich gene promoters. Thus, our study not only
establishes a role for Tet1 in modulating DNA methylation levels
at CpG-rich promoters, but also reveals a dual function of Tet1 in 
promoting transcription of pluripotency factors as well as parti- 
cipating in the repression of Polycomb-targeted developmental 
regulators. 

The Tet protein family includes three members (Tet1-3), all of
which have the capacity to convert 5-methylcytosine (5mC) to
5-hydroxymethylcytosine (5hmC) in a 2-oxoglutarate- and Fe(II)-
dependent manner”’. Consistent with the relative enrichment of
5hmC in ES cells, Tet1 is highly expressed in undifferentiated ES cells
and Tet1 messenger RNA levels decrease upon ES cell differenti-
ation”*. Lentiviral-mediated depletion of Tet1 in mouse E14 ES cells
cultured under feeder-free conditions leads to phenotypic changes that
include partial loss of alkaline phosphatase activity and SSEA1
immunoreactivity, decreased self-renewal capacity and proliferation
rate, downregulation of pluripotency factor Nanog and upregulation
of differentiation genes (for example, lineage markers for trophecto-
derm and primitive endoderm in a subset of cells)”. Thus, Tet1 may be
required for mouse ES cell maintenance.

To gain insights into the mechanism by which Tet1 contributes to ES
cell function, we investigated the genome-wide distribution of Tet1 in
mouse ES cells by chromatin immunoprecipitation coupled with high-
throughput DNA sequencing (ChIP-seq) using a highly specific Tet1
antibody (Supplementary Fig. 1a). Analysis of replicate ChIP-seq experi-
ments identified a total of 35,564 binding sites with high confidence
(P<10 &, or false discovery rate (FDR) of 0.01) (Supplementary Fig. 
1b, c and Supplementary Table 1). In contrast, parallel experiments
using rabbit IgG did not yield specific enrichment (Fig. 1b and Sup-
plementary Fig. 1c). Moreover, ChIP-seq analysis also indicated that
Tet1 occupancy was generally reduced in fluorescence-activated cell
sorting (FACS)-sorted Tet1-depleted ES cells (Supplementary Fig. 2a).
ChIP followed by quantitative polymerase chain reaction (qPCR) ana-
lysis further confirmed decreased Tet1 occupancy on randomly
selected Tet1-binding sites in response to Tet1 depletion (Supplemen-
tary Fig. 2b). Most Tet1 binding sites are located in gene-rich euchro-
matic regions, as 79.8% of all Tet1-bound loci are within intragenic
regions or 5 kb intergenic regions up- or downstream of annotated
genes (Supplementary Fig. 3a, b). Similar to other CXXC zinc-finger-
domain-containing proteins (for example, Cfp1 and Kdm2a)*”, Tet1 is
enriched (86.6%) at CpG islands (Fig. 1a-c). Consistently, de novo motif
discovery analysis® identified a CpG-rich sequence as the highest rank-
ing motif within Tet1-bound regions (Fig. 1d). Quantification of CpG
density within Tet1-binding loci indicated that, similar to Kdm2a
(Supplementary Fig. 4a, b), Tet1 occupancy positively correlates with
CpG density (Supplementary Fig. 4c). Collectively, the above results
indicate that Tet1 high-affinity binding sites are generally enriched for
CpG-rich sequences.

Because Tet proteins are capable of converting 5mC to 5hmC”’, we
investigated the relationship between Tet1 occupancy and DNA
methylation in mouse ES cells using methylated DNA immunopreci-
pitation coupled with mouse whole-genome tiling microarrays
(MeDIP-chip). We found that DNA methylation is generally excluded
from transcription start sites (TSSs) of Tet1-bound gene promoters
(Fig. 2a, blue line in left panel). In contrast, Tet1-unbound gene pro-
moters are frequently DNA methylated (Fig. 2a, red line in left panel).
These results are consistent with previous studies demonstrating that
CpG-rich gene promoters, where Tet1 is enriched (Fig. 1), are generally
hypomethylated’*. Further analysis indicates that CpG islands not
bound by Tet1 are associated with higher 5mC levels compared to
Tet1-bound CpG islands (Fig. 2a, right panel). Thus, Tet1 occupancy
at gene promoters is inversely correlated to levels of DNA methylation.

To investigate whether Tet1 is required for maintaining the hypo-
methylated state at Tet1-bound regions, we analysed DNA methyla-
tion profiles in Tet1-depleted ES cells and demonstrated that Tet1
deficiency led to a general increase in 5mC levels at both TSSs and
genomic regions flanking the proximal promoters of CpG-rich genes
(Fig. 2b, c and Supplementary Fig. 5a, b). An increase in 5mC levels
was also detected within proximal promoter regions of a subset of
CpG-poor gene promoters (Fig. 2b, c and Supplementary Fig. 5a, b).
The observed 5mC changes in Tet1-depleted cells were not due to
interarray variations as a co-hybridization strategy analysing biologic-
ally independent replicates also revealed that the increase in 5mC levels
induced by Tet1 deficiency was generally enriched at Tet1-binding
sites (Fig. 2d and Supplementary Table 2). Locus-specific bisulphite
sequencing confirmed that Tet1-binding sites and their surrounding
Figure 1 | Tet1 is enriched at genomic regions with high-density CpG
dinucleotides. a, Genome-wide occupancy of Tet1 at all annotated gene
promoters in ES cells (black, CpG-rich genes; red, CpG-poor genes). The
enrichment of Tet1 binding was determined by ChIP-seq analysis. Average
Tet1 binding measured by −log10 (peak P values) in 200-bp bins is shown
within genomic regions covering 5 kb up- and downstream of TSSs.
b, Enrichment of Tet1 (purple), Kdm2a (orange)* and H3K4me3 (green)?
measured by ChIP-seq at representative genes in ES cells (black, CpG-rich; red,
CpG-poor). ChIP-seq data are shown in reads per million with the y-axis floor
set to 0.5 reads per million. Genomic regions with statistically significant
enrichment of Tet1 binding (measured by −log10 (peak P values); P< 10°)


regions became more DNA methylated in response to Tet1 depletion
(Supplementary Fig. 6). Collectively, these data suggest that Tet1
binding is required for maintaining a DNA hypomethylated state at
a large cohort of CpG-rich gene promoters.

Previous studies have established a link between DNA methylation
and histone methylation” ''. To explore a potential relationship
between Tet1 occupancy and histone modifications, we compared
the binding profile of Tet1 with that of major histone modifications 
in mouse ES cells previously determined by ChIP-seq (Supplementary 
Table 3)'*"*. We found that histone H3 lysine 4 trimethylation 
(H3K4me3) is positively correlated to Tet1 binding at gene promoters, 
are also indicated. c, Heatmap representation of genomic regions with high-
density CpG sites (CpG islands), binding profiles of Tet1, Kdm2a* and
H3K4me3" in ES cells at all annotated mouse gene promoters (5 kb flanking
TSSs of Refseq genes). The heatmap is rank-ordered from genes with CpG
islands of longest length to no CpG islands within 5-kb genomic regions
flanking TSSs. The presence of CpG islands is shown in colour (blue, present;
white, absent). ChIP-seq enrichment was measured by −log10 (peak P values)
and is shown by colour scale. The following colour scales (white, no
enrichment; blue, high enrichment) were used for Tet1/Kdm2a and H3K4me3
respectively: (0, 50) and (0, 200). d, A DNA motif that is enriched in Tet1-
bound loci in ES cells.


as 71.3% of all Tet1-binding sites (n = 25,359) overlapped with
H3K4me3 peaks (Fig. 1c). Analysis of the histone modification profiles
that flank TSSs of Tet1-bound genes revealed two categories of Tet1
targets (Fig. 3a, b and Supplementary Table 4). The first group is
associated with bivalent domains, a chromatin state characterized by
the presence of both H3K4me3 and H3K27me3™. Interestingly, biva-
lent gene promoters in ES cells are generally hypomethylated’*. In
contrast, the second group is associated with active histone marks,
including H3K4me3, H3K4me1 and H3K36me3 (Fig. 3a). These data
indicate that Tet1 can associate with both actively transcribed as well as
repressed target genes. Gene ontology analysis indicated that genes
related to development and cell differentiation are highly enriched
in the first group of Tet1 targets, whereas genes involved in house-
keeping functions are enriched in the second group of Tet1 targets
(Supplementary Fig. 7).

The fact that Tet1 occupies the promoters of actively transcribed as
well as repressed genes suggests that Tet1 might have a dual function in


Figure 2 | Tet1 maintains a DNA hypomethylated state at Tet1-bound
regions. a, The distribution frequency of regions enriched with DNA
methylation is shown for Tet1-bound (blue) and unbound (red) gene
promoters (left) or CpG islands (right) in mouse ES cells. b, Heatmap
representation of CpG islands and the changes in DNA methylation (5mC) in
response to Tet1 depletion. The DNA methylation gained after Tet1 depletion
was calculated by subtracting 5mC levels in control knockdown (Con KD)
from those in Tet1 knockdown (Tet1 KD). c, Changes in 5mC levels in response
to Tet1 knockdown are shown for both CpG-rich and CpG-poor gene
promoters. Note that proximal promoters and 5’ intragenic regions of CpG-
rich genes are associated with a higher increase in 5mC levels as compared to
those of CpG-poor genes in response to Tet1 depletion. d, An increase in 5mC
levels in response to Tet1 depletion is specifically enriched at the centre of Tet1-
binding loci. Changes in 5mC levels between control knockdown and Tet1
knockdown ES cells were determined by co-hybridizing and analysing genomic
DNA from control knockdown and Tet1 knockdown cells on the whole-
genome tiling microarrays.
transcription regulation. Microarray analysis comparing the gene
expression of control and Tet1-depleted mouse ES cells identified a
total of 1,332 genes that are differentially expressed (788 upregulated
and 544 downregulated in Tet1 knockdown cells) (Supplementary Fig.
8). Of these differentially expressed genes, a significant percentage
(80%) are associated with Tet1 occupancy within 5 kb up- or down-
stream of their TSSs (1,067 out of 1,332) (Fig. 3c and Supplementary
Table 5). Interestingly, despite the fact that DNA methylation has been
primarily associated with transcriptional repression, more Tet1 targets
are upregulated rather than downregulated in response to Tet1 deple-
tion (677 targets are upregulated, P = 2.0 × 10 *°, compared with 390
targets downregulated, P = 4.1 × 10 °, Fisher’s exact test) (Fig. 3c and
Supplementary Fig. 8a), indicating that Tet1 may also be involved in
gene repression in mouse ES cells. Notably, genes with known func-
tions in development and differentiation, for example, Cdx2 (trophec-
toderm), Sox17 (endoderm) and Krt8 (ectoderm), are among the
upregulated Tet1 targets (Fig. 3c and Supplementary Fig. 8b). In con-
trast, genes related to pluripotency and ES cell functions (for example,
Nanog, Tcl1 and Esrrb) are among the downregulated Tet1 targets
(Fig. 3c and Supplementary Fig. 8b). Consistent with the notion that
changes in gene expression in response to Tet1 depletion are mainly due
to Tet1-occupancy-mediated effects, instead of a secondary effect due
to Nanog downregulation, overexpression of Nanog in Tet1-depleted
ES cells could only rescue a subset (~30%) of dysregulated Tet1 direct
targets (Supplementary Fig. 9a, b). Notably, the rescued targets include 
pluripotency-related genes such as Tcl1 and Esrrb. Gene expression 
profiling and qPCR with reverse transcription (RT-qPCR) analysis 
demonstrated that overexpression of Nanog rescued a subset of genes 
through direct (Nanog bound) or indirect (Nanog unbound) regu- 
lation (Supplementary Fig. 9a-c and Supplementary Table 6). 
Collectively, these results indicate that Tet1 is not only required for
maintaining the expression of a subset of genes important for ES 
cell pluripotency, but also required for the repression of a cohort of 
developmental regulators. 
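
The gene counts quoted in this paragraph can be cross-checked with simple arithmetic. The snippet below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Differentially expressed genes after Tet1 knockdown (counts from the text).
up_total, down_total = 788, 544
# Of those, genes with Tet1 occupancy within 5 kb of the TSS.
up_bound, down_bound = 677, 390

all_changed = up_total + down_total          # should equal 1,332
all_bound = up_bound + down_bound            # should equal 1,067
bound_fraction = all_bound / all_changed     # quoted as 80%

print(f"{all_bound} of {all_changed} changed genes are Tet1-bound "
      f"({100 * bound_fraction:.1f}%)")
print(f"bound among upregulated:   {100 * up_bound / up_total:.1f}%")
print(f"bound among downregulated: {100 * down_bound / down_total:.1f}%")
```

The totals are internally consistent: 677 + 390 gives the 1,067 Tet1-bound genes, and 1,067 of 1,332 rounds to the quoted 80%.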

Because many developmental regulators are repressed by Polycomb 
repressive complexes PRC1 and PRC2 (refs 16, 17), we sought to deter- 
mine whether Tet1 might facilitate silencing of developmental regulators 
by promoting Polycomb repression. Comparison of our expression data 
sets to a published data set’® revealed that 43% of Tet1-repressed genes
were also in the upregulated gene list of Eed-deficient ES cells, which is
significantly higher than that expected by chance (43% versus 9.5%,
P = 3.59 × 10 '"*, Fisher’s exact test), supporting a potential role for
Tet1 in PRC2-mediated repression of developmental regulators. Indeed,
analysis of the histone modification states of Tet1-regulated genes in
wild-type ES cells indicated that Tet1-repressed genes were preferen-
tially associated with H3K27me3 (Fig. 3c), a mark deposited by


Figure 3 | Tet1 binds to and functions in both
repressed (bivalent) and actively transcribed
(H3K4me3-only) genes. a, Heatmap
representation of genomic regions with high-
density CpG sites (CpG islands), binding profile of
Tet1, and major histone modifications (H3K4me1
(ref. 12), H3K4me3, H3K27me3 and H3K36me3
(ref. 13)) in mouse ES cells at indicated Tet1 target
genes (5 kb flanking TSSs). The heatmap is rank-
ordered from genes with highest H3K27me3
enrichment to no H3K27me3 within 5-kb genomic
regions flanking TSSs. The following colour scales
(white, no enrichment; blue, high enrichment)
were used for Tet1/H3K27me3/H3K4me1/
H3K36me3 and H3K4me3 respectively: (0, 50) and
(0, 200). WT, wild type. b, Relative percentage of
genes with different chromatin states shown for all
genes, Tet1-bound and unbound genes.
c, Heatmap representation of differentially
expressed Tet1 targets between control knockdown
and Tet1 knockdown mouse ES cells. Note that
Tet1-repressed targets are preferentially associated
with bivalent chromatin states, whereas Tet1-
activated targets are generally H3K4me3-only
genes.
PRC2'*””. In contrast, Tet1-activated targets were preferentially asso-
ciated with H3K36me3, a mark associated with transcriptional elonga-
tion” (Fig. 3c), supporting the notion that Tet1-mediated DNA
hypomethylation at these gene promoters may facilitate their expression.
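
For readers unfamiliar with the overlap statistic used above, a one-sided Fisher's exact test on a 2 × 2 table can be computed directly from the hypergeometric distribution. This is a generic sketch with an invented toy table; the paper's underlying counts are not given in this section, and whether the authors used a one- or two-sided test is not stated, so the one-sided form is an assumption.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    Returns P(X >= a) with all margins fixed, where X is the top-left cell
    (for example, genes present in both lists).
    """
    row1 = a + b            # size of the first list
    col1 = a + c            # size of the second list
    n = a + b + c + d       # all genes considered
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


# Toy table (hypothetical counts, not taken from the paper).
p_value = fisher_exact_greater(8, 2, 1, 5)
```

For the toy table the tail sums two terms, 189 and 7, over 8,008 possible tables, giving a P value of about 0.024. The same function applies to any overlap question once the four cell counts are known.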

The fact that genes upregulated in response to Tet1 knockdown
significantly overlap with those upregulated by Eed deficiency indicates
that Tet1 may cooperate with PRC2 in silencing this group of genes.
Given that the protein levels of PRC2 subunits are not significantly
altered in response to Tet1 depletion (Supplementary Fig. 10), Tet1 is
unlikely to affect PRC2 expression or stability. As 95.2% of PRC2-
binding sites (defined as Ezh2/Suz12 co-bound’') overlapped with
Tet1-bound loci (Fig. 4a), we next evaluated the effect of Tet1 depletion
on the chromatin-binding ability of PRC2. ChIP coupled with whole
genome tiling microarrays (ChIP-chip) in control and Tet1 knockdown
cells revealed that Tet1 depletion impaired the binding of Ezh2, a core
subunit of PRC2, to a large fraction (72.2%) of PRC2-binding sites
(Fig. 4a, b, Supplementary Figs 11, 12a and Supplementary Table 7).
ChIP-qPCR further confirmed the effect of Tet1 knockdown on Ezh2/
Suz12 recruitment (Fig. 4c and Supplementary Fig. 12b). Interestingly,
depletion of Ezh2 did not affect Tet1 binding to chromatin (Fig. 4c),
indicating that Tet1 may function upstream of PRC2. Furthermore,
overexpression of Nanog in Tet1-depleted cells also failed to fully rescue
the Ezh2 binding to Tet1/PRC2 co-bound targets (Supplementary Fig.
9d). Given that previous purification of the PRC2 complex did not
uncover Tet1 as an associated component’*””” and the fact that a stable
interaction between Tet1 and PRC2 could not be demonstrated
(unpublished observation), we favour a model in which Tet1 may
indirectly contribute to PRC2 recruitment by maintaining a DNA
hypomethylated state at PRC2-bound loci. This model is supported
by a recent study demonstrating that DNA methylation impedes bind-
ing of PRC2 to chromatin”.

In summary, we demonstrate that Tet1 is preferentially enriched in
CpG-island-containing gene promoters in mouse ES cells. This result
is consistent with the presence of a CXXC domain in Tet1 and the
demonstration that the CXXC domain is preferentially bound to CpG-
rich sequences**. The nonrandom genomic distribution of Tet1 sug-
gests that genes with CpG-rich promoters are selectively regulated by a
Tet1-dependent epigenetic state (that is, 5hmC) or active demethyla-
tion process. The convergence of CpG-binding proteins at CpG
islands, including Cfp1, Kdm2a and Tet1, cooperatively contributes
to the establishment of a specialized chromatin/epigenetic state at
CpG-rich gene promoters. Specifically, Cfp1 confers H3K4me3 by
Figure 4 | Tet1 is required for chromatin binding of PRC2 in mouse ES
cells. a, Tet1 depletion affects the binding of PRC2 to the majority of its targets.
PRC2-binding sites are divided into three groups (Tet1/PRC2 co-bound Tet1
dependent, Tet1 independent and PRC2-only bound). b, Shown are Tet1, Ezh2
and Suz12 (ref. 21), and H3K27me3 (ref. 13) occupancy, and the effect of Tet1
depletion on Ezh2 occupancy and 5mC levels at seven representative Tet1-
repressed bivalent targets. Regions associated with significant changes in Ezh2
occupancy between control and Tet1-depleted ES cells were measured by
whole-genome tiling microarrays. Genomic regions that are further examined
by locus-specific ChIP-qPCR in c are shaded. c, ChIP-qPCR analysis of Tet1
(top panels) and Ezh2 (bottom panels) occupancy at the promoters of eight
representative Tet1-repressed targets in control (Con KD), Tet1-depleted (Tet1
KD) and Ezh2-depleted (Ezh2 KD) ES cells. Error bars represent standard
deviation determined from duplicate experiments.
recruiting the H3K4me3 methyltransferase Setd1 (ref. 5); Kdm2a leads
to depletion of H3K36me2 (ref. 4), and Tet1 maintains DNA in a
hypomethylated state at CpG islands (Fig. 2).

In addition to binding to gene promoters with CpG islands, Tet1 also
binds to a subset of actively transcribed CpG-poor gene promoters,
such as Nanog, Tcl1 and Esrrb, whose gene products have an important
role in ES cell maintenance. In this scenario, Tet1 has an important role
in promoting the transcriptionally active state of these genes by main-
taining a hypomethylated promoter state’. Interestingly, Tet1 also con-
tributes to the silencing of a group of developmental regulators and
somatic lineage differentiation genes that are silenced by Polycomb
group proteins (Fig. 3c). Depletion of Tet1 leads to a decrease in
Ezh2 occupancy at many PRC2 targets, indicating that Tet1 contributes
to PRC2 recruitment. Therefore, our study reveals a novel function for
Tet1 in the recruitment of PRC2 and silencing of developmental regu-
lators, which also contributes to the role of Tet1 in mouse ES cell
maintenance. We note that, in contrast to our results, a recent study 
has shown that knockdown of Tet1 alone is not sufficient to confer 
any noticeable phenotype in mouse ES cells”. This difference is 
probably due to the use of different ES cell lines, culture conditions 
and knockdown efficiency (see Supplementary Information for details). 
Collectively, our study establishes a dual function for Tet1 in transcrip- 
tional regulation in mouse ES cells. 


METHODS SUMMARY 

Mouse ES cell cultures and lentiviral knockdown. Mouse E14Tg2A ES cells were
cultured in feeder-free conditions*. For Tet1 knockdown, mouse ES cells were
infected with lentiviruses expressing both the GFP reporter and short-hairpin
RNA (shRNA) specific for Tet1 (5'-GCAGATGGCCGTGACACAAAT-3’). For
Ezh2 knockdown, mouse ES cells were infected with lentiviruses expressing both the
GFP reporter and shRNA specific for Ezh2 (5'-GTATGTGGGCATCGAACGA-
3’) as previously described’. All analyses were performed using Tet1- or Ezh2-
depleted ES cells that were purified on the basis of GFP fluorescence by FACS
8 days after lentiviral transduction. Lentiviruses expressing GFP alone were used
as a control.

ChIP-seq and data analysis. ChIP and sequencing experiments were performed
as described’”*. Briefly, cells were cross-linked with 1% formaldehyde at 25 °C for
10 min and sonicated to generate chromatin fragments of 200-500 bp. Chromatin
fragments from 10-20 million cells were immunoprecipitated using 8 µg of the
Tet1 antibody’ or IgG control from two biologically independent samples. ChIP-
seq library construction and Illumina sequencing were performed as described
previously**. All sequencing reads were mapped to the mouse genome (NCBI
Build 36/UCSC mm8). Sequencing reads from two independent Tet1 ChIP-seq
experiments were combined and Tet1-enriched regions were identified by the
MACS program’’. Sequencing reads from IgG control experiments were used as
negative controls in MACS. The statistical cutoff used for identifying Tet1-binding 
sites was a P value < 10 * and fold enrichment (over IgG control) > 10. 
Genome-wide DNA methylation (5mC) analysis. Methylated DNA immuno-
precipitation (MeDIP) coupled with whole-genome DNA tiling microarrays was
performed as described**. Immunoprecipitated DNA was prepared from both
control and Tet1-depleted ES cells, and hybridized to mouse whole-genome tiling 
microarrays (NimbleGen). 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Constructs and antibodies. All the constructs and antibodies used in this study 
have been described previously~"* or were purchased from the following sources: 
EZH2 (Cell Signaling; catalogue no. 4905); EED (Santa Cruz; sc-133537); Jarid2 
(Abcam; ab48137); AEBP2 (Proteintech group; 11232-2-AP), EZH1 (Abcam; 
ab64850) and actin (Sigma; AC-40). 

Mouse ES cell cultures and lentiviral knockdown. Mouse E14Tg2A ES cells were
cultured in feeder-free conditions’. For Tet1 knockdown, mouse ES cells were infected
with lentiviruses expressing both the GFP reporter and short-hairpin RNA (shRNA)
specific for Tet1 (5'-GCAGATGGCCGTGACACAAAT-3’). For Ezh2 knockdown,
mouse ES cells were infected with lentiviruses expressing both the GFP reporter and
shRNA specific for Ezh2 (5'-GTATGTGGGCATCGAACGA-3’) as previously
described”. All analyses were performed using Tet1- or Ezh2-depleted ES cells that were
purified on the basis of GFP fluorescence by FACS 8 days after lentiviral transduction.
Lentiviruses expressing GFP alone were used as a control.

RNA isolation, qPCR and expression microarray analysis. Total RNA from
cultured cells was isolated using RNeasy Mini Kit (Qiagen), and cDNA was generated
with ImProm-II Reverse Transcription System (Promega). Real-time qPCR reac-
tions were performed on an ABI PRISM 7700 Sequence Detection System (Applied
Biosystems) using SYBR Green (Invitrogen). cDNA levels of target genes were ana-
lysed using comparative Ct methods, where Ct is the cycle threshold number, and
normalized to GAPDH. RT-qPCR primers are listed in Supplementary Table 8.
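
The comparative cycle-threshold calculation is usually written as a fold change of 2 raised to the negative ΔΔCt. The text states only that a comparative method normalized to GAPDH was used, so the exact formula below, the assumption of perfect amplification efficiency (doubling per cycle) and the function name are all illustrative.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample relative to a control.

    Each target Ct is first normalized to the reference gene (for example GAPDH),
    then the sample is compared with the control: fold = 2 ** -(ddCt).
    Assumes 100% amplification efficiency.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Invented Ct values: the target crosses threshold two cycles later in the
# knockdown than in the control, with an unchanged reference gene.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
```

With these invented values the target is two cycles later in the sample, which corresponds to a fold change of 0.25, that is, a fourfold reduction.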

For expression microarray analysis comparing control and Tet1 knockdown ES
cells, 2 µg of total RNA purified from GFP sorted cells were reverse-transcribed into
cDNA with a T7-(dT)24 primer from a custom kit (Life Technologies). Biotinylated
cRNA was then generated from the cDNA reaction using the BioArray High Yield
RNA Transcript Kit. The cRNA was then fragmented in fragmentation buffer (40 mM
Tris-acetate, pH 8.1, 100 mM KOAc and 150 mM MgOAc) at 94 °C for 35 min before
microarray hybridization. Fifteen micrograms of fragmented cRNA was then added to
a hybridization cocktail (0.05 mg ml−1 fragmented cRNA, 50 pM control oligonu-
cleotide B2, BioB, BioC, BioD and cre hybridization controls, 0.1 mg ml−1 herring
sperm DNA, 0.5 mg ml−1 acetylated BSA, 100 mM MES, 1 M Na+, 20 mM EDTA,
0.01% Tween 20). Ten micrograms of cRNA were used for hybridization to
Affymetrix GeneChip Mouse Genome 430 2.0 Array. Hybridization was carried
out at 45 °C for 16 h. The arrays were then washed and stained with
R-phycoerythrin streptavidin, before scanning. Washing, scanning and basic analysis
were carried out using Affymetrix GeneChip Microarray Suite 5.0 software. Raw signal
intensity (.cel files) was RMA normalized using affy (R/bioconductor). For identifica-
tion of differentially expressed genes, we used NIA array analysis tool
(http://lgsun.grc.nia.nih.gov/ANOVA). Of all the probes present on the microarray, signal
intensity of redundant probes was averaged before analysis. The following parameters
were used for analysing statistically significant differential expression: threshold
z-value to remove outliers, 10,000; Error Model, Max (Average, Bayesian); error
variance averaging window, 200; proportion of highest error variances to be removed,
0.05; Bayesian degrees of freedom, 20; the FDR threshold was set at 0.05.

For heatmap display, RMA-normalized signal intensity was log2 transformed 
and median-centred. Heatmaps were generated using Cluster3 and Java Treeview. 
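
The heatmap preprocessing can be sketched in a few lines. The sketch assumes that centring is applied per gene (row), which the text does not state, and the input layout (a dictionary of gene to intensities) is hypothetical; it is not the Cluster3 implementation.

```python
import math
from statistics import median


def log2_median_centre(rows):
    """rows: dict mapping gene -> list of RMA-normalized intensities (linear scale)."""
    centred = {}
    for gene, values in rows.items():
        # log2-transform each intensity
        logged = [math.log2(v) for v in values]
        # subtract the gene's own median so that each row is centred on zero
        m = median(logged)
        centred[gene] = [x - m for x in logged]
    return centred
```

For a gene with intensities 1, 2 and 4, the centred log2 values are -1, 0 and 1.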

ChIP-seq. ChIP-seq experiments were performed as described. Briefly, cells were
cross-linked with 1% formaldehyde at 25 °C for 10 min and sonicated to generate
chromatin fragments of 200-500 bp. Chromatin fragments from 10-20 × 10⁶ cells
were immunoprecipitated using 8 µg of the Tet1 antibody or IgG control from two
biologically independent samples. ChIP-seq library construction and Illumina sequencing
were performed as described previously. All sequencing reads were mapped to
the mouse genome (mm8). Sequencing reads from both Tet1 ChIP-seq experiments
were combined and Tet1-enriched regions were determined by the MACS program
(version 1.3.7.1). Sequencing reads from IgG control experiments were used as negative
controls in MACS. Only uniquely mapped reads were retained and redundant reads
were filtered out. The statistical cutoff used for identifying Tet1-binding sites was P 
value < 10 * (or FDR < 1%) and fold enrichment (over IgG control) > 10. ChIP-seq 
data sets of H3K4me1 (ref. 12), H3K4me3, H3K27me3, H3K36me3 (ref. 13), Ezh2,
Suz12 (ref. 21), Kdm2a (ref. 4) and RNA pol II (ref. 28) were obtained from previous 
publications and reanalysed in MACS using identical parameters (except statistical 
cutoff was set to Pvalue < 10°). A summary of all ChIP-seq experiments used in this 
study (generated by this work and by previous publications) is provided in 
Supplementary Table 3. ChIP-seq read counts for each ChIP-seq experiment
were binned into 400-bp windows at 100-bp steps along the genome and visualized
in the Cisgenome browser. To assign ChIP-seq enriched regions to genes, a
complete set of Refseq genes was downloaded from the UCSC table browser (accessed
May 2010). For all data sets, genes with enriched regions within 5 kb of their TSSs were
called bound. 
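
The two bookkeeping steps in this section, sliding-window read counts and assignment of enriched regions to genes, can be sketched as follows. This is a simplified illustration under stated assumptions: reads are reduced to single positions, distance is measured from the TSS to the nearest edge of a region, and the function names and data layout are hypothetical. It is not the Cisgenome or MACS code.

```python
def window_counts(read_positions, chrom_length, window=400, step=100):
    """Count reads in overlapping windows (400 bp wide, moved in 100-bp steps)."""
    counts = []
    for start in range(0, chrom_length - window + 1, step):
        end = start + window
        counts.append(sum(1 for p in read_positions if start <= p < end))
    return counts


def bound_genes(peaks, tss_by_gene, max_dist=5000):
    """peaks: list of (chrom, start, end); tss_by_gene: gene -> (chrom, position)."""
    bound = set()
    for gene, (chrom, tss) in tss_by_gene.items():
        for p_chrom, start, end in peaks:
            if p_chrom != chrom:
                continue
            # distance from the TSS to the nearest edge of the region (0 if inside)
            dist = max(start - tss, tss - end, 0)
            if dist <= max_dist:
                bound.add(gene)
                break
    return bound
```

The quadratic loops keep the sketch short; a genome-scale analysis would sort the regions and use an interval index instead.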

Gene ontology analysis. Functional enrichment of bivalent and H3K4me3-only
Tet1 target genes was calculated using the hypergeometric distribution followed by
Benjamini correction in DAVID.
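
As an illustration of the statistics named above, the following sketch computes an upper-tail hypergeometric P value and Benjamini-Hochberg adjusted P values. It is a plain textbook version written for this note, not DAVID's own implementation, and the function names and example counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M genes, n of which carry the annotation."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest P value to the smallest, enforcing monotonicity
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

Each annotation term would be tested with `hypergeom_sf`, and the resulting list of P values passed once through `benjamini_hochberg`.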


Genome-wide DNA methylation (5mC) analysis. Methylated DNA immunoprecipitation
(MeDIP) was performed as described previously with minor modifications.
Briefly, genomic DNA was sequentially digested with proteinase K and RNase
A, and purified by phenol/chloroform extraction. Purified genomic DNA was sonicated
and heat denatured (95 °C, 10 min). An aliquot of sonicated genomic DNA was
saved as input. Five micrograms of fragmented genomic DNA was immunoprecipitated
with 5 µl of a monoclonal antibody against 5-methylcytidine (Eurogentec) at
4 °C overnight in a final volume of 500 µl of IP buffer (10 mM sodium phosphate (pH
7.0), 140 mM NaCl, 0.05% Triton X-100). We incubated the DNA-antibody mixture
with 30 µl protein G Dynabeads (Invitrogen) for 2 h at 4 °C and washed it three times
with 1 ml IP buffer. We then treated the beads with proteinase K for at least 3 h at
55 °C and purified the methylated DNA by phenol-chloroform extraction followed
by ethanol precipitation. For whole-genome DNA tiling microarray analysis, immunoprecipitated
DNA prepared from both control and Tet1-depleted ES cells was
co-hybridized to mouse whole-genome tiling microarrays (NimbleGen).

Whole-genome tiling microarray analysis. For whole-genome DNA tiling
microarray analysis of relative changes in 5mC levels or Ezh2 occupancy, immunoprecipitated
DNA was prepared from both control and Tet1-depleted ES cells
and amplified using a whole-genome amplification kit (Sigma). Amplified DNA was
labelled (5' Cy5- or Cy3-random nonamers, TriLink Biotechnologies) using the
standard protocol (NimbleGen Arrays User's Guide for ChIP-chip analysis).
Hybridization of labelled samples to the whole-genome HD2 microarray 4-array
set (Roche/NimbleGen, ~2.1 million tiling probes per array, covering the entire
non-repetitive portion of the mouse genome) was carried out for 16-20 h at 42 °C
using the NimbleGen Hybridization System 4. After stringent washes, microarrays
were scanned using an Agilent scanner at 5-µm resolution. Data
were extracted and analysed using NimbleScan v2.5 (Roche/NimbleGen).

For identification of probes associated with a significant increase in 5mC levels or
decrease in Ezh2 occupancy in response to Tet1 depletion in microarray experiments
with the IP/IP configuration (DNA from control knockdown and Tet1
knockdown cells was co-hybridized to the same microarrays), a non-parametric
one-sided Kolmogorov-Smirnov (KS) test was used (KS score). Briefly, from the scaled
log2-ratio data, a fixed-length window (750 bp) is placed around each consecutive
probe and the one-sided KS test is applied to determine whether the probes are
drawn from a significantly more positive distribution of intensity log-ratios than
those in the rest of the array. The resulting score for each probe is the −log10 P value
from the windowed KS test around that probe. Using NimbleScan v2.5, peak data
files are generated from the P-value data files. NimbleScan software detects peaks by
searching for at least 2 probes above a P-value minimum cutoff (−log10) of 2. Peaks
within 500 bp of each other are merged. For calculating the absolute 5mC levels in
control knockdown and Tet1 knockdown ES cells (Supplementary Fig. 5a), the
MEDME program was used to correct the nonlinear relationship between
MeDIP-chip signals (measured by microarray experiments with the IP/input configuration)
and genomic CpG density.
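
The idea behind the windowed KS score can be sketched as below. This is not the NimbleScan implementation: the P value uses the classical large-sample approximation for the one-sided two-sample test, which is an assumption of this sketch, and the function name and inputs are hypothetical.

```python
import math


def ks_one_sided_score(window, background):
    """-log10 P that `window` log2-ratios are shifted above `background` values."""
    n, m = len(window), len(background)
    # D+ = max over x of F_background(x) - F_window(x);
    # it is large when the window values tend to be higher than the background
    d_plus = 0.0
    for x in sorted(set(window) | set(background)):
        f_win = sum(v <= x for v in window) / n
        f_bg = sum(v <= x for v in background) / m
        d_plus = max(d_plus, f_bg - f_win)
    # classical asymptotic approximation: P ~ exp(-2 * D+^2 * n*m / (n + m))
    p = math.exp(-2.0 * d_plus ** 2 * n * m / (n + m))
    return -math.log10(p)
```

In the procedure described above, `window` would hold the log2 ratios of the probes within 750 bp of a given probe and `background` those of the rest of the array; probes scoring above 2 would then be candidates for peak calling.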

For visualizing raw microarray signal intensity in the genome browser, probe 
level smoothing (log2 ratios of probes within 1 kb are averaged) was performed for 
each probe. For calculating the peak distribution, regions associated with signifi- 
cant changes in 5mC levels or Ezh2 occupancy were binned to 500-bp intervals 
using a 250-bp sliding window within genomic regions 5-kb up- and downstream 
of TSSs of annotated Refseq genes. Heatmaps were generated and visualized using 
Cluster3 and Java TreeView, respectively. 

Locus-specific ChIP assays and bisulphite sequencing. Cells were fixed in a final
concentration of 1% formaldehyde. After incubation at 25 °C for 10 min, the
reaction was stopped by the addition of 125 mM glycine. ChIP assays were performed
using the protocol supplied with the ChIP assay kit (Upstate
Biotechnology). After extensive washing, ChIPed DNA was eluted from the beads, 
and analysed on an ABI 7300 Real Time PCR System (Applied Biosystems) using 
SYBR Green (Invitrogen). Primer sequences are listed in Supplementary Table 9. 

Bisulphite sequencing was performed as described previously with minor modifications.
Five micrograms of sodium-bisulphite-treated DNA was subjected
to PCR amplification using the first set of primers; PCR products were used as
templates for a subsequent PCR reaction using nested primers. The PCR products 
of the second reaction were then subcloned using the Invitrogen TA cloning Kit 
following the manufacturer’s instructions. PCRs and subcloning were performed in 
duplicate for each sample. The clones were sequenced using the M13 reverse 
primer. Primers for bisulphite sequencing are listed in Supplementary Table 10. 

Genome-wide mapping of 5-hydroxymethylcytosine in embryonic stem cells

5-hydroxymethylcytosine (5hmC) is a modified base present at low
levels in diverse cell types in mammals. 5hmC is generated by the
TET family of Fe(II) and 2-oxoglutarate-dependent enzymes through
oxidation of 5-methylcytosine (5mC). 5hmC and TET proteins
have been implicated in stem cell biology and cancer, but
information on the genome-wide distribution of 5hmC is limited.
Here we describe two novel and specific approaches to profile the
genomic localization of 5hmC. The first approach, termed GLIB
(glucosylation, periodate oxidation, biotinylation), uses a combination
of enzymatic and chemical steps to isolate DNA fragments containing
as few as a single 5hmC. The second approach involves
conversion of 5hmC to cytosine 5-methylenesulphonate (CMS) by
treatment of genomic DNA with sodium bisulphite, followed by
immunoprecipitation of CMS-containing DNA with a specific antiserum
to CMS. High-throughput sequencing of 5hmC-containing
DNA from mouse embryonic stem (ES) cells showed strong enrichment
within exons and near transcriptional start sites. 5hmC was
especially enriched at the start sites of genes whose promoters bear
dual histone 3 lysine 27 trimethylation (H3K27me3) and histone 3
lysine 4 trimethylation (H3K4me3) marks. Our results indicate that
5hmC has a probable role in transcriptional regulation, and suggest a
model in which 5hmC contributes to the ‘poised’ chromatin signature
found at developmentally regulated genes in ES cells.

We developed two independent methods for precipitation of 5hmC in
genomic DNA. The GLIB method (Fig. 1a) entails addition of a glucose
molecule to each 5hmC with T4 phage β-glucosyltransferase (BGT)
(Supplementary Fig. 1a). The glucose moiety is oxidized with sodium
periodate, which converts the vicinal hydroxyl groups to aldehydes,
and further modified with aldehyde-reactive probe, which adds two
biotin molecules to each 5hmC (Fig. 1a). A related strategy, which uses
a custom-synthesized UDP-glucose analogue (UDP-6-N3-glucose), was
recently used to profile 5hmC distribution in mouse brain. The second
method uses an antibody against cytosine 5-methylenesulphonate
(CMS), produced by reaction of 5hmC with sodium bisulphite
(Fig. 1b). Anti-CMS antibodies are more sensitive and less density-dependent
than anti-5hmC in DNA dot blot assays. Both methods are
specific for DNA containing 5hmC (Supplementary Fig. 1b).

We examined the ability of GLIB-treated (biotinylated) and
bisulphite-treated 5hmC-containing DNA to be pulled down by streptavidin
and anti-CMS antisera, respectively. Using varying ratios of
dCTP:dhmCTP, we generated 201-base-pair PCR amplicons with
differing incorporation of cytosine and 5hmC in identical sequence
contexts (Supplementary Table 1). At each dhmCTP:dCTP ratio, the
fraction of amplicons that contain no 5hmC, and therefore should not
be precipitated, can be calculated using the binomial equation (Supplementary
Table 2). Observed and calculated pull-down efficiencies
were very similar (Fig. 1c): even at low densities of 5hmC, more than
90% of DNA fragments calculated to contain a single 5hmC were
precipitated after GLIB treatment. Anti-CMS pull-down showed
increased density dependence compared to GLIB, but had very low
background, such that there was still a strong preference for precipitation
of sparsely hydroxymethylated amplicons over unmodified
ones (Fig. 1d). The performance of a commercial polyclonal anti-hmC
antiserum was inferior to that of anti-CMS, in terms of higher
background pull-down of unmodified DNA (3.0% versus 0.06%) as well
as greater density dependence (Fig. 1e). By testing PCR amplicons with
varying 5mC, we confirmed that the methyl-DNA immunoprecipitation
(MeDIP) technique, which uses a monoclonal antibody to 5mC, is
extremely density-dependent (Fig. 1f).
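
The binomial calculation referred to above can be sketched as follows. The sketch assumes that each cytosine position incorporates 5hmC independently, with a probability equal to the dhmCTP fraction of the nucleotide mix; the number of cytosines per amplicon and the function names are hypothetical, since the actual values are in Supplementary Tables 1 and 2.

```python
from math import comb


def fraction_with_k_hmc(n_cytosines, p_hmc, k):
    """Binomial probability that an amplicon carries exactly k 5hmC residues."""
    return comb(n_cytosines, k) * p_hmc ** k * (1 - p_hmc) ** (n_cytosines - k)


def expected_pulldown(n_cytosines, p_hmc):
    """Fraction of amplicons with at least one 5hmC (the maximum that could be precipitated)."""
    return 1.0 - fraction_with_k_hmc(n_cytosines, p_hmc, 0)
```

For a toy amplicon with two cytosine positions and an equal mix of dCTP and dhmCTP, a quarter of the amplicons carry no 5hmC, so at most three quarters could be precipitated.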

We applied the GLIB and anti-CMS techniques to enrich 5hmC-containing
regions in genomic DNA using genomic DNA with low,
intermediate and high levels of 5hmC (Supplementary Fig. 1c, left
panel). For the GLIB and anti-CMS pull-downs, the amount of specifically
precipitated genomic DNA was proportional to the relative
amount of 5hmC (Supplementary Fig. 1c). The GLIB technique did
not produce mutations (Supplementary Table 3), the biotinylated DNA
could be efficiently eluted by heating with formamide (Supplementary
Fig. 1d), and the biotinylated adduct had a minimal inhibitory effect
on PCR at 5-25% hmC density (delay of approximately 0.1 cycles per
converted 5hmC residue; Supplementary Fig. 1f). There was no PCR
delay with CMS-containing PCR amplicons except at very high CMS
levels (Supplementary Fig. 1e), consistent with our previous report
that CMS inhibits PCR predominantly at biologically irrelevant
sequences where multiple CMS adducts occur in a row.

We investigated the genome-wide localization of 5hmC in murine
V6.5 ES cells. For GLIB-treated DNA, we chose Helicos single-molecule
DNA sequencing, which does not require an amplification step
and thus avoids PCR bias. For CMS-enriched genomic DNA, we
used an Illumina instrument, as longer read lengths are needed for
efficient alignment of bisulphite-treated DNA to the genome. With
the GLIB method, 119,600 regions of the genome, averaging 1,422 bp
in length, showed a substantially higher density of reads in the +BGT
as opposed to the −BGT sample; with the CMS method, comparison of
enriched to input DNA identified 109,264 enriched regions (average 
length 1,168 bp). There was high overlap in the enriched regions, here 
designated 5hmC-enriched regions of the genome (HERGs) (Fig. 1g). 
Comparing the number of HERGs retrieved by using different frac- 
tions of aligned reads yielded a curve that approached an asymptote, 
suggesting that a majority of hydroxymethylated regions had been 
identified (Supplementary Fig. 2a). 
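
The overlap between two sets of enriched regions can be counted with a short sketch like the one below. It assumes half-open intervals and counts a region as shared if it overlaps any region of the other set by at least one base; the function name and data layout are hypothetical, and the criterion actually used for Fig. 1g is not stated here.

```python
def count_overlapping(regions_a, regions_b):
    """Number of regions in A that overlap at least one region in B.

    Regions are (chrom, start, end) tuples with half-open coordinates.
    """
    by_chrom = {}
    for chrom, start, end in regions_b:
        by_chrom.setdefault(chrom, []).append((start, end))
    n = 0
    for chrom, start, end in regions_a:
        # two intervals overlap when each starts before the other ends
        if any(s < end and start < e for s, e in by_chrom.get(chrom, [])):
            n += 1
    return n
```

Called once in each direction, this gives the shared and method-specific region counts of the kind shown in the left panel of Fig. 1g.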

To determine whether HERGs overlapped with methylated DNA 
regions, we identified 62,991 5mC-enriched regions of the genome 

Figure 1 | Comparison of 5hmC enrichment methods. a, The GLIB method.
Glucose is added to 5hmC by BGT, oxidized with sodium periodate to yield
aldehydes, and reacted with the aldehyde reactive probe (ARP), yielding two
biotins at the site of every 5hmC. b, 5hmC is converted to CMS by sodium
bisulphite. c-f, Precipitation of PCR amplicons containing (1) varying amounts
of 5hmC by GLIB methodology (c), anti-CMS methodology (d), or anti-5hmC
antibody (e); or (2) varying amounts of 5mC by anti-5mC antibody (f). pAb,
polyclonal antibody; mAb, monoclonal antibody. Between 1 and 6 independent
experiments per method; mean percentage input precipitated ± s.d. is
indicated. g, Overlap between HERGs identified by the GLIB and anti-CMS
methodologies. Left panel, number of HERGs; right panel, number of base pairs
contained within HERGs.


(MERGs) by MeDIP. The resulting 5mC profile does not represent a 
complete map of 5mC in mouse ES cells, but rather is biased towards 
regions of dense methylation. Statistics pertaining to the GLIB, anti- 
CMS and MeDIP enrichments are shown in Supplementary Figs 2b-d, 
the corresponding annotations are provided in Supplementary Tables 
4-9, and reads and enrichment for the Hoxb locus are provided in 
Supplementary Table 10. As expected, both HERGs and MERGs con- 
tained a high frequency of CG sequences relative to the genome at large 
(Supplementary Fig. 3a). Intriguingly, HERGs also contained relatively 
high levels of CAG sequences, the most frequent site of non-CpG 
methylation in human ES cells, and we confirmed that the TET1
catalytic domain is capable of hydroxylating 5mC in CHG and CHH
(H = A, T or C) contexts in vitro (Supplementary Fig. 3b).

Analysis of the GLIB and anti-CMS HERG sets gave very similar 
results. We observed a strong correlation between the densities of 
HERGs and genes on a given chromosome; this trend was less pro- 
nounced for MERGs (Fig. 2a). When we compared the distribution of 

Figure 2 | Genomic distribution of 5hmC or 5mC enriched regions of the
genome. a, Correlation of HERG or MERG density on each chromosome (y-axis)
with gene density in the same chromosome (x-axis). Density is defined as
frequency divided by chromosome length. b, c, Both HERGs and MERGs are
enriched in transcribed regions (b), whereas HERGs are preferentially enriched
at enhancers and the start sites of genes (c). The percentage of HERGs or
MERGs mapping to the indicated genomic feature (darker bar) is compared
with the percentage of randomly chosen sequences mapping to that feature
(lighter bar). 5' UTR, 5' untranslated region. TSS, transcription start site
(−800 bp to +200 bp relative to start of transcription). See Supplementary
Methods for detailed definition of how HERGs or MERGs were classified as
mapping to genomic features. d, Distribution of HERGs and MERGs relative to
the TSS. The centre of each HERG was plotted relative to the nearest TSS in
1,000 bp increments from −10 kb to +10 kb surrounding the TSS.


HERGs and MERGs to the distribution of DNA fragments of equivalent
length distributed randomly across the genome, both 5hmC and
5mC were enriched within transcribed regions, particularly exons,
which are known to be sites of high CpG density as well as high
DNA methylation (Fig. 2b and Supplementary Fig. 3c). However,
only 5hmC was enriched at transcription start sites (TSSs) and within
the 5' untranslated regions (UTRs) of genes (Fig. 2c). Moreover, 5hmC
was relatively more enriched in enhancers (defined by H3K4me1 in the
absence of H3K4me3) than 5mC, strongly indicating a connection
between 5hmC and regulatory elements (Fig. 2c). Plotting each HERG
as a single point relative to the nearest TSS, we found that 5hmC is
heavily enriched both 5' and 3' of the TSS, whereas 5mC is enriched
primarily 3' of the TSS (Fig. 2d). These results show a unique distribution
of 5hmC in regulatory elements of genes, one that is not
explained simply by the distribution of 5mC, the substrate for TET
enzymes.
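
The TSS-relative distribution described above (each region reduced to a single point and binned in 1,000-bp increments within 10 kb of the nearest TSS) can be sketched as below. The strand handling, the single-chromosome input and the function name are assumptions of this sketch and are not taken from the Supplementary Methods.

```python
from bisect import bisect_left
from collections import Counter


def tss_offset_histogram(region_centres, tss_list, bin_size=1000, span=10000):
    """tss_list: non-empty list of (position, strand) on one chromosome; strand is +1 or -1."""
    tss_sorted = sorted(tss_list)
    positions = [pos for pos, _ in tss_sorted]
    hist = Counter()
    for centre in region_centres:
        # the nearest TSS is one of the two neighbours of the insertion point
        i = bisect_left(positions, centre)
        candidates = tss_sorted[max(i - 1, 0):i + 1]
        pos, strand = min(candidates, key=lambda t: abs(centre - t[0]))
        # positive offsets lie 3' (downstream) of the TSS on the gene's own strand
        offset = (centre - pos) * strand
        if -span <= offset < span:
            hist[(offset // bin_size) * bin_size] += 1
    return hist
```

Each key of the returned counter is the lower edge of a bin, so `hist[-1000]` counts regions centred within 1 kb upstream of their nearest TSS.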

The enrichment of 5hmC at the TSS suggested a role for 5hmC in
transcriptional regulation. To evaluate this possibility, we used published
data sets on gene expression and histone modification
profiles in mouse ES cells to compare the sets of genes with 5hmC or
5mC at their start sites (Supplementary Tables 11-13) to the set of all
genes in the genome. 5hmC is preferentially found at promoters with
high or intermediate CpG content (Supplementary Fig. 4a), even
though high CpG promoters are hypomethylated in ES cells.
This distribution is consistent with the possibility that TET proteins
are preferentially recruited to high CpG regions through their CpG-binding
CXXC domains.

In ES cells, genes with ‘bivalent’ H3K27 and H3K4 trimethylation 
are transcriptionally inactive but poised for expression upon differentiation
to embryoid bodies. We found that genes with 5hmC
at their start sites were disproportionately likely to contain bivalent 
domains at their promoters; likewise, a majority (~60%) of genes 
reported to contain bivalent domains have 5hmC at their start sites 
(Fig. 3a). 5hmC was less likely to be found at genes with the activating 
‘H3K4me3 only’ mark than is predicted by chance. Moreover, genes 
with 5hmC at their start sites showed lower expression in murine ES 
cells than other genes (Fig. 3b) and were more likely to be upregulated 
upon embryoid body differentiation (Fig. 3c). The correlation of 5hmC 
with bivalent domains holds even after adjusting for the known relation
between promoter CpG content and bivalency (Supplementary
Fig. 5). Although 5mC at the TSS also correlates with lower gene
expression in murine ES cells (Supplementary Fig. 4b), 5mC is not
enriched at the promoters of genes with bivalent domains (Supplementary
Fig. 4c), and genes with high levels of 5mC did not tend
to be upregulated upon embryoid body differentiation (Supplemen- 
tary Fig. 4d). Thus 5hmC is preferentially enriched at the promoters of 
genes with bivalent histone marks in ES cells, indicating that 5hmC 
may contribute functionally to the ‘poised’ but inactive state of these 
genes in ES cells. 

Genes with 5hmC at their start sites are also disproportionately 
enriched in the set of genes whose promoters bind polycomb repressor 
complex (PRC) components, and in a majority of genes with the 
‘H3K27me3 only’ mark (Fig. 3a). There is a statistically significant
correlation between genes that had 5hmC at the TSS and genes that
were upregulated upon small interfering RNA-mediated Tet1 depletion
(therefore, negatively regulated by Tet1) (Fig. 3d), indicating that
5hmC in the promoter region has a negative role in the transcription of
some genes in ES cells. Unlike 5mC, however, 5hmC is not substantially
enriched at sites of heterochromatic H3K9 or H4K20 trimethylation
(data not shown).

Collectively, our results support a model in which 5hmC and 5mC
have different roles in transcription. Like 5mC, 5hmC at promoters is
predictive of lower levels of gene expression. However, 5hmC is
uniquely associated with a ‘poised’ chromatin configuration and with
genes that are upregulated upon differentiation, and may thus be
involved in priming loci for rapid activation in response to appropriate
signals. Activation of lineage-specific genetic loci upon differentiation
could occur via a postulated 5mC ‘demethylation’ pathway (5mC to
5hmC to cytosine) or through recruitment of transcriptional regulators
that specifically recognize 5hmC and are activated in response to differentiation
signals. The ability to profile 5hmC even at sparsely hydroxymethylated
loci will allow a careful evaluation of these possibilities in
differentiating cells.

Figure 3 | Properties of HERGs at transcription start sites. a, The percentage
of genes with 5hmC at the TSS (blue and red bars) reported to contain histone
H3 trimethylation (left) or PRC components (right) at their promoters is
compared to the fraction of all genes (grey bars) with these promoter marks.
Number of genes in each category is indicated. b, HERGs are enriched at the
TSSs of genes with low expression in ES cells. All genes were ranked by level of
expression in ES cells and sorted into deciles from lowest to highest. The
percentage of genes within each decile with 5hmC enriched at the TSS (left) or
within gene bodies (right) is shown for each methodology. The first five
deciles, which comprise genes lacking statistically significant expression,
are pooled and averaged in this analysis. c, HERGs are enriched at the TSS of
genes upregulated upon differentiation to embryoid bodies (EB). The
percentage of genes with 5hmC at their TSS (blue bars) that are substantially
upregulated or downregulated upon differentiation to EB is compared with the
percentage of total genes similarly regulated (grey bars). Number of genes in
each category is indicated. d, Overlap between genes with 5hmC at the TSS and
genes positively or negatively regulated by Tet1 (ref. 8).

METHODS SUMMARY


GLIB precipitation. V6.5 ES cells were lysed and proteins digested by treatment
with Proteinase K at 55 °C. DNA was purified by phenol-chloroform extraction and
then precipitated with ethanol. RNA was removed with RNase A (Qiagen). Samples
were treated with 20 ng BGT per 1 µg DNA at 30 °C for 3 h (50 mM HEPES pH 8.0,
25 mM MgCl2, 50 µM UDPG), then oxidized with 23 mM sodium
periodate for 16 h at 22 °C in 0.1 M sodium phosphate pH 7.0. Periodate was quenched
by the addition of 46 mM sodium sulphite at room temperature for 10 min, and the
DNA was then exchanged into 1× PBS and incubated with 2 mM Aldehyde Reactive Probe
(Invitrogen) for 1 h at 37 °C. DNA was sequenced with a HeliScope Single
Molecule Sequencer. See Supplementary Methods for detailed protocol.

CMS precipitation. The generation of the anti-CMS antibody is described elsewhere.
DNA fragments were ligated with methylated adaptors and treated with
sodium bisulphite (Qiagen). The DNA was then denatured for 10 min at 95 °C
(0.4 M NaOH, 10 mM EDTA), neutralized by addition of cold 2 M ammonium
acetate pH 7.0, incubated with anti-CMS antiserum in 1× immunoprecipitation
buffer (10 mM sodium phosphate pH 7.0, 140 mM NaCl, 0.05% Triton X-100) for
2 h at 4 °C, and then precipitated with Protein G beads. Precipitated DNA was
eluted with Proteinase K, purified by phenol-chloroform extraction, and amplified
by 4-6 cycles of PCR using PfuTurbo Cx Hotstart DNA polymerase (Stratagene).
DNA sequencing was carried out using Illumina/Solexa Genome Analyzer II and
HiSeq sequencing systems.

Methylation at the 5' position of cytosine in DNA has important roles
in genome function and is dynamically reprogrammed during early
embryonic and germ cell development. The mammalian genome
also contains 5-hydroxymethylcytosine (5hmC), which seems to be
generated by oxidation of 5-methylcytosine (5mC) by the TET family
of enzymes that are highly expressed in embryonic stem (ES) cells.
Here we use antibodies against 5hmC and 5mC together with high-throughput
sequencing to determine genome-wide patterns of
methylation and hydroxymethylation in mouse wild-type and
mutant ES cells and differentiating embryoid bodies. We find that
5hmC is mostly associated with euchromatin and that whereas 5mC
is under-represented at gene promoters and CpG islands, 5hmC is
enriched and is associated with increased transcriptional levels. Most,
if not all, 5hmC in the genome depends on pre-existing 5mC and the
balance between these two modifications is different between genomic
regions. Knockdown of Tet1 and Tet2 causes downregulation of
a group of genes that includes pluripotency-related genes (including
Esrrb, Prdm14, Dppa3, Klf2, Tcl1 and Zfp42) and a concomitant
increase in methylation of their promoters, together with an
increased propensity of ES cells for extraembryonic lineage differentiation.
Declining levels of TETs during differentiation are associated
with decreased hydroxymethylation levels at the promoters of
ES cell-specific genes together with increased methylation and gene
silencing. We propose that the balance between hydroxymethylation
and methylation in the genome is inextricably linked with the balance
between pluripotency and lineage commitment.

5hmC occurs in ES cells (in 5% of CpGs), Purkinje cells in the mouse
brain, and in other adult mouse tissues. The TET1 and TET2
enzymes, which can oxidise 5mC thus generating 5hmC, are highly
expressed in ES cells and regulate the expression of pluripotency-related
genes together with the potential of ES cells to differentiate into the
embryonic and extraembryonic lineages. The genomic distribution of
5hmC in the ES cell genome and during differentiation and its relation
to the distribution of 5mC are unknown. Because bisulphite conversion
and high-throughput sequencing (BS-Seq) does not distinguish between
5mC and 5hmC, we used specific antibodies (Fig. 1a and Supplementary
Fig. 1) to determine the genomic distribution of both 5mC and
5hmC by MeDIP-Seq and hMeDIP-Seq (methylated DNA immunoprecipitation
and hydroxymethylated DNA immunoprecipitation followed
by high-throughput sequencing, respectively) in two different ES
cell lines (J1, E14), Np95-/- ES cells (lacking maintenance methylation),
Tet1/2 knockdown ES cells, and embryoid bodies (EBs). We
obtained 19-33 million paired-end reads for each sample; all samples
were sequenced in two biological replicates, which were found to be
highly reproducible (Supplementary Table 1 and Supplementary Fig.
2). Note that (h)MeDIP-Seq profiles (like chromatin immunoprecipitation
(ChIP)-Seq profiles) reveal only the relative distribution of the
respective modification within a sample and therefore cannot be used
to infer absolute quantitative differences between samples or antibodies.


By immunofluorescence we found strong nuclear staining for 5hmC
in ES cells (and in other cell types) that broadly overlapped in euchromatic
regions with staining for 5mC, whereas DAPI-dense heterochromatic
regions are highly enriched for 5mC but not 5hmC (Fig. 1b and

Figure 1 | Distribution of 5-hydroxymethylcytosine in the mouse genome.
a, The specificities of the antibodies used in this study were confirmed by dot
blot and (h)MeDIP using PCR fragments containing 5hmC, 5mC or C.
b, Immunofluorescence co-staining of J1 ES cells with antibodies against 5hmC
(green) and 5mC (red). Grey scale images of the two modifications are shown
separately. Staining for 5mC is particularly strong in pericentromeric
heterochromatin (arrows), contrary to 5hmC. Scale bar, 10 µm. c, Examples of
hMeDIP-Seq and MeDIP-Seq profiles at a genomic region on Chr2 in J1 ES
cells. d, Relative enrichment (log2 bound/unbound) of 5hmC and 5mC in
repetitive sequences in J1 and E14 ES cells and E14 EBs. e, Enrichment of 5hmC
and 5mC in single-copy genomic features. Values in d and e represent means of
two biological replicates with the ends of the error bars corresponding to the
individual data points. f, Validation of the presence of 5hmC in CGIs using
glucMS-qPCR (grey bars represent mean ± s.d.). Selected CGIs (black bars,
upper panel) were tested for the presence of 5hmC at particular MspI sites (grey
vertical line). Genomic coordinates of the left-most base pairs of each region:
Ctnna3 (chr10, 63044495); Zfp64 (chr2, 168750875); Bend3 (chr10, 43230661);
EG240055 (also known as Neurl1b: chr17, 26567975).


1Laboratory of Developmental Genetics and Imprinting, The Babraham Institute, Cambridge CB22 3AT, UK. 2Bioinformatics Group, The Babraham Institute, Cambridge CB22 3AT, UK. 3Centre for
Trophoblast Research, University of Cambridge, Cambridge CB2 3EG, UK. †Present address: Genetics Department, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal.


*These authors contributed equally to this work. 


398 | NATURE | VOL 473 | 19 MAY 2011 


©2011 Macmillan Publishers Limited. All rights reserved 


Supplementary Fig. 3). (h)MeDIP-Seq confirmed that 5hmC is widely 
distributed throughout non-repetitive regions (see example in Fig. 1c) 
and substantially overlaps with the distribution of 5mC, whereas satellite 
repeats (which are located in heterochromatin) are highly enriched for 
5mC but substantially less for 5hmC (Fig. 1d and Supplementary Fig. 4). 
In single-copy regions the distribution of 5hmC in ES cells follows a 
broadly similar pattern to that of 5mC in intergenic regions, exons and 
introns, with a higher enrichment in exons over introns (Fig. 1e). 
Notably, whereas 5mC is relatively depleted from CpG islands 
(CGIs), gene promoters, the 5′ ends of LINE1 elements (their promoters), 
CTCF and pluripotency transcription factor binding sites, in 
accordance with previous work, 5hmC is relatively enriched in all of 
these (Fig. 1d, e and Supplementary Figs 5 and 6). Furthermore, upon 
differentiation into EBs 5hmC enrichment decreases in these regions, 
concomitant with a gain of 5mC. Consistent with the distinct 5mC and 
5hmC patterns at CGIs, whereas 5mC is depleted from high CpG 
density promoters (as described previously), 5hmC remains enriched 
(Supplementary Fig. 7), indicating that the ratio of 5hmC to 5mC is 
higher here than in low CpG density promoters. To independently and 
quantitatively verify the presence of 5hmC in CGIs we carried out 
glucosylation of 5hmC in genomic DNA followed by MspI digestion 
(which does not digest glucosylated 5hmC) and quantitative PCR 
(qPCR) across MspI sites (glucMS-qPCR). We found significant levels 
of 5hmC (3–24%) in selected CGIs (Fig. 1f and Supplementary Fig. 8). 
We also determined the corresponding 5mC levels (see Methods) and 
found them to be comparable to those of 5hmC at these CGIs, whereas 
elsewhere 5mC can be several fold higher than 5hmC (Supplementary 
Fig. 8, compare regions B and C). These measurements suggest that 
5hmC is still derived from 5mC at CGIs, but that a high proportion 
of 5mC is converted to 5hmC in these regions. We also confirmed the 
presence of cytosine modifications in these regions by bisulphite con- 
version followed by Sequenom MassARRAY analysis (Supplementary 
Fig. 8). 

We found by thin layer chromatography analysis that 5hmC was 
reduced in Tet1/2 knockdown cells and Np95−/− cells, and eliminated 
in Dnmt1−/−/Dnmt3a−/−/Dnmt3b−/− triple knockout (TKO) ES 
cells (Fig. 2a). We confirmed this by glucMS-qPCR on selected regions 
and all were found to display lower levels of both modifications in 
Np95−/− ES cells and only vestigial amounts in TKO cells (Fig. 2b). We 
find losses of 5hmC enrichment at exons, 5′ regions of LINE1 elements 
and CTCF binding sites in Np95−/− and Tet1/2 knockdown ES cells 
(Fig. 2d and Supplementary Fig. 6). Enrichment of 5hmC is also 
reduced at promoters in Np95−/− but not in Tet1/2 knockdown cells. 
However, maintenance of relative enrichment levels at promoters 
upon Tet1/2 knockdown means that its absolute 5hmC levels follow 
the observed genome-wide reduction in 5hmC (see Supplementary 
Fig. 9). Overall these results suggest that most 5hmC in mouse ES cells 
is dependent on pre-existing 5mC (although we cannot exclude that 
5hmC may also be generated by an independent mechanism), but 
that the kinetics of generating and maintaining 5mC and converting it 
to 5hmC are likely to be different for different genomic elements. 

Our protocol for (h)MeDIP-Seq conserves information on strand- 
specificity of hydroxymethylation and methylation (Supplementary 
Fig. 10). Extensive occurrence of strand-biased regions was found in 
both 5mC and 5hmC methylomes, and these regions were enriched for 
CpH (where H is C, A, or T) dinucleotides (Fig. 3a), indicating that 
strand-specific (hydroxy)methylation occurs largely in non-CpG con- 
text. The strand specificity and sequence context of asymmetric 
(hydroxy)methylation were confirmed by analysis of a BS-Seq ES cell 
data set (Fig. 3c) and by bisulphite sequencing of selected asymmetric 
regions, showing that modification occurred predominantly in CpH 
context, where it was entirely strand-specific (Fig. 3d). Overall strand 
bias in 5hmC profiles was increased in Np95−/− and Tet1/2 knockdown 
cells (Fig. 3b), consistent with the dependence of 5hmC on pre-existing 
5mC at CpGs and suggesting that TET1 and TET2 may have a 
preference for oxidizing 5mC in CpG context. 

Figure 2 | Genetic relationship between methylation and 
hydroxymethylation. a, Thin layer chromatography separation of 
radioactively end-labelled bases from MspI-digested genomic DNA, showing 
reduced levels of 5hmC (arrowheads) in methylation- and TET-deficient ES 
cells. b, glucMS-qPCR validation of genomic regions specifically enriched for 
5mC or 5hmC in wild-type (WT) J1, Np95−/− and TKO ES cells (bars represent 
mean ± s.d.). Genomic regions were selected on the basis of (h)MeDIP-Seq 
profiles of wild-type ES cells. c, Examples of (h)MeDIP-Seq profiles in wild 
type, Np95−/− and Tet1/2 KD ES cells. 5hmC profiles are relatively similar, 
whereas 5mC distribution is significantly altered in Np95−/− cells, but less so in 
Tet1/2 knockdown (Tet1/2 KD) cells. Shadowed areas highlight regions of 
altered 5hmC and/or 5mC enrichment. d, Relative enrichment at promoters, 
exons and 5′ regions of LINE1 elements in J1, Np95−/− and Tet1/2 KD ES cells. 
Np95 deficiency causes depletion of both 5hmC and 5mC in all three regions, 
whereas Tet1/2 KD causes preferential reduction of 5hmC at exons and LINE1 
promoters, which leads to increased 5mC enrichment in these regions. Values 
represent means of two biological replicates with the ends of the error bars 
corresponding to the individual data points. 


We asked if there was a relationship between hydroxymethylation at 
gene promoters in ES cells and their transcription levels (Fig. 4a and 
Supplementary Fig. 11). We generated an ES cell transcriptome by 
RNA-Seq and classified promoters with respect to enrichment of 
5mC and 5hmC. Whereas the presence of 5mC in the promoter region 
was associated with low levels of transcription as expected, 5hmC was 
associated with high levels of transcription. In fact, genes specifically 
enriched for 5hmC were more highly transcribed than those with 
neither of the modifications (Fig. 4a); this effect is also partially depend- 
ent on promoter CpG density (Supplementary Fig. 11). Promoters 
enriched for both 5hmC and 5mC were also associated with higher 
levels of transcription than promoters specifically enriched for 5mC, 
suggesting that presence of 5hmC partially overcomes the silencing 
effect of 5mC. Consistent with these observations, promoters that are 
high in 5hmC are enriched in the activating histone mark H3K4me3, 
whereas those enriched in 5mC are depleted of H3K4me3 (Fig. 4b; data 
from ref. 14). 5hmC in exons was also found associated with increased 
levels of transcription (Supplementary Fig. 11), consistent with what 
has been found in mouse cerebellum. 

RNA-Seq of Tet1/2 knockdown ES cells identified 107 genes that 
were downregulated in knockdown cells (18 out of 22 validated by 

Figure 3 | Strand specificity and sequence context of methylation and 
hydroxymethylation. a, Enrichment of dinucleotide sequences present in the 
central 200 bp of 5hmC and 5mC regions separated into biased and unbiased 
fragments. CG dinucleotides are enriched in unbiased regions, as expected from 
its symmetric nature. Biased regions are enriched for CH dinucleotides, 
indicating extensive non-CpG (hydroxy)methylation. b, Strand bias 
measurements, which represent the overall level of asymmetric methylation in 
the genome. Depletion of NP95 increases strand bias in the 5hmC and 5mC 
profiles due to reduced CpG methylation. Knockdown of Tet1/2 decreases 
strand bias in the 5mC profile, as expected if reduction of 5hmC at CpGs leads 
to an accumulation of 5mC at the same sites. Values represent means of two 
biological replicates with the ends of the error bars corresponding to the 
individual data points. c, BS-Seq validation of the (h)MeDIP data. Percentages 
of methylated CpGs present in 5hmC-enriched peaks containing low or high 
5mC levels are plotted (left) showing the symmetrical nature of CpG 
methylation (in both biased and unbiased peaks). Conversely, CpH 
methylation in biased peaks is asymmetric in nature (right). Error bars 
represent 95% confidence intervals. d, Validation of asymmetric methylation 
by bisulphite sequencing of biased and unbiased (h)MeDIP-Seq peaks. Filled 
squares represent methylated/hydroxymethylated cytosines and empty squares 
represent unmodified cytosines. Bisulphite sequencing confirms the 
asymmetric nature of methylation in biased regions (middle and bottom) and 
reveals extensive non-CpG methylation, whereas the unbiased region (top) 
contains mostly CpG methylation. 


quantitative PCR with reverse transcription (qRT-PCR); Fig. 4c and 
Supplementary Fig. 12). We also carried out Tet1 knockdown on its 
own and found all 18 validated genes consistently downregulated by 
qRT-PCR (Supplementary Fig. 12), indicating that TET1 has the 
major role in the observed expression changes. This was confirmed 
in a stable ES cell line containing a doxycycline-inducible shRNA 

Figure 4 | Gene expression and promoter methylation in ES cells and during 
differentiation. a, Relationship between 5hmC and 5mC levels at gene 
promoters and expression of downstream genes measured by RNA-Seq in J1 ES 
cells. Significance levels are relative to all promoters (**P < 0.001, 
***P < 0.0001 throughout the figure). b, Relationship between 5hmC, 5mC 
and presence/absence of H3K4me3 and H3K27me3 at gene promoters (data 
from ref. 14). c, RT-PCR validation of genes downregulated upon Tet1/2 KD 
across three biological replicates and expression level changes in the same genes 
from ES to EB differentiation (values are mean ± s.d.). d, Induction of a shRNA 
targeting Tet1 in a stable ES cell line also results in the downregulation of the 
genes in c. Restoring TET1 expression leads to recovery in expression of these 
genes (values are mean ± s.d.). e, Co-immunostaining of mock and Tet1/2 KD 
ES cells for 5hmC (red) and CDX2 (green). Scale bar, 20 µm. Cells were scored 
for presence of 5hmC and CDX2 expression (n = 1,120 and 1,209 for mock and 
Tet1/2 KD cells, respectively; values are percentage of cells ± 95% confidence 
interval). f, KD/WT ratios for promoter 5mC and 5hmC show that 
downregulated genes, and in particular qRT-PCR-validated ones, suffer 
methylation changes different from the pool of all genes. Genes downregulated 
upon ES to EB differentiation have increased 5mC enrichment levels and 
decreased 5hmC. g, Examples of 5hmC and 5mC profiles of two genes in J1 
and E14 ES cells, Tet1/2 KD and E14 EBs, and corresponding quantification of 
5mC and 5hmC levels by glucMS-qPCR (the MspI site used is indicated by 
the grey arrowhead). A considerable reduction in 5hmC levels is detected 
upon both Tet1/2 KD and differentiation into EBs, with a concomitant 
increase in 5mC. 



against Tet1 in which we found again downregulation of the same set of 
genes (Fig. 4d). Removal of doxycycline led to recovery of TET1 and 
5hmC levels back to normal and we found that gene expression changes 
were restored to wild-type levels (Fig. 4d and Supplementary Fig. 13). 

The genes that were downregulated in response to Tet1/2 knockdown 
included pluripotency-related genes such as Esrrb, Klf2, Tcl1, Zfp42, 
Dppa3, Ecat1 (also known as 2410004A20Rik) and Prdm14 (by contrast 
Nanog, Oct4 (also known as Pou5f1) and Sox2 were not downregulated). 
These are amongst the earliest genes to be downregulated upon ES cell 
differentiation, and include genes that undergo epigenetically regulated 
transcriptional fluctuations in ES cells, such as Dppa3 and Zfp42 (refs 17, 
18). It remains to be seen whether such fluctuations are associated with 
stochastic patterning of ShmC at these promoters across a cell popu- 
lation. Consistent with a role in regulating transcription of pluripotency- 
associated genes, the upstream region of the Tet1 gene contains a large 
cluster of binding sites for core pluripotency transcription factors 
(Supplementary Fig. 14), and both TET1 and TET2 are rapidly down- 
regulated upon differentiation to EBs (Supplementary Fig. 15). Indeed, 
we found a 23-fold enrichment (P < 2.2 × 10⁻¹⁶) for transcripts downregulated 
upon Tet1/2 knockdown to also be substantially downregulated 
during ES cell differentiation, 16 of which we validated by qRT-PCR 
(Fig. 4c and Supplementary Fig. 12). 

Although ES cells subjected to Tet1/2 knockdown did not appear to 
differentiate spontaneously, we found that markers of extraembryonic 
endoderm differentiation (Gata6 and Sox17) were precociously 
expressed when Tet1/2 knockdown ES cells were differentiated with 
retinoic acid (Supplementary Fig. 16). This is consistent with downregulation 
of Esrrb and Prdm14, which safeguard ES cells from commitment 
to endoderm cell fate. We also found a significant increase 
of CDX2-positive cells concomitant with the decrease in the number of 
5hmC-positive cells upon Tet1/2 knockdown (all CDX2-positive cells 
were particularly low in 5hmC, Fig. 4e). These results are in agreement 
with recent studies using Tet1 knockdown ES cells. 

Promoters of genes that were downregulated during ES cell differ- 
entiation had a marked decrease of 5hmC enrichment levels, which 
was accompanied by a significant increase in 5mC levels (Fig. 4f, g). 
Notably, genes that were downregulated in Tet1/2 knockdown cells 
had robustly increased levels of 5mC in their promoters (Fig. 4f, g; see 
also Supplementary Fig. 17), suggesting that decline of TET levels 
during differentiation leads, at least in part, to gene silencing through 
methylation of promoters. Enrichment of 5hmC in these promoters 
was unchanged, that is, its absolute levels accompanied the genome-wide 
loss of 5hmC. Importantly, we confirmed the reduction of 5hmC 
levels by glucMS-qPCR of selected promoters (Fig. 4g). 

Our study shows that 5hmC is relatively enriched in euchromatic parts 
of the genome, including in CGIs and promoters, and its presence in 
promoters and exons is associated with increased levels of transcription. 
This may in part be explained by the removal of the repressive effects of 
5mC, but it is also possible that 5hmC itself has a positive effect on 
transcription. Furthermore, the TET proteins may have roles in tran- 
scriptional regulation in addition to their ability to convert 5mC to 5hmC. 
The core pluripotency network is connected with TET1/2, which regulate 
genes with established roles in pluripotency and epigenetic reprogramming 
such as Esrrb, Dppa3, Klf2, Zfp42 and Prdm14, thus safeguarding ES 
cells against commitment to extraembryonic cell fate (Supplementary 
Fig. 19). On the basis of these results, we suggest that hydroxymethylation 
and the TET proteins could also have a role in erasing methylation marks 
from promoters of pluripotency-related genes during fusion of ES cells 
with somatic cells, and during the generation of induced pluripotent 
stem cells for which erasure of DNA methylation seems critical. 
Hydroxymethylation may also have a role in the large-scale erasure of 
methylation in primordial germ cells and early embryos. 


METHODS SUMMARY 


All immunofluorescence and (h)MeDIP-Seq data were produced using a rabbit 
anti-5hmC polyclonal antibody (Active Motif, catalogue no. 39769) and a mouse 
anti-5mC monoclonal antibody (Eurogentec, MMS-900P-B). J1, E14, Np95−/− 
and TKO ES cells were grown under standard conditions in the presence of 
serum and leukaemia inhibitory factor (LIF). Differentiation of E14 ES cells into 
embryoid bodies was done by removal of LIF and suspension culture for 13 days. 
RNA interference (RNAi) experiments were performed as described with modifications. 
A2lox.cre ES cells were targeted with a short hairpin RNA-micro RNA 
(shRNA-mir) sequence against Tet1 cloned into p2Lox; induction of Tet1 knockdown 
and recovery were achieved by the addition and later removal of doxycycline. 
Cells were either fixed for immunofluorescence or collected for DNA and/or 
RNA extraction. MeDIP-Seq and hMeDIP-Seq were based on the MeDIP 
method using either the anti-5mC or anti-5hmC antibodies, respectively, but 
incorporating the ligation of Illumina adaptors for paired-end sequencing, which 
was performed on an Illumina Genome Analyzer GAIIX. RNA-Seq was performed 
as described previously with modifications. Bioinformatic analyses were performed 
using SeqMonk (http://www.bioinformatics.bbsrc.ac.uk/projects/seqmonk/) 
and custom Perl or R scripts. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

METHODS 

Antibody validation by dot blot and methylated/hydroxymethylated DNA 
immunoprecipitation assay (MeDIP/hMeDIP). All immunofluorescence and 
(h)MeDIP-Seq data were produced using a rabbit anti-5hmC polyclonal antibody 
(Active Motif, catalogue no. 39769) and a mouse anti-5mC monoclonal antibody 
(Eurogentec, MMS-900P-B). To generate control templates, PCR fragments were 
amplified from M13mp18 DNA or custom oligonucleotides using either dCTP, 
d5mCTP or d5hmCTP. For dot blot analysis, DNA samples were denatured at 
99 °C for 5 min and spotted onto Hybond-N+ nitrocellulose membranes (GE 
Healthcare). After ultraviolet cross-linking, membranes were blocked overnight 
with 10% non-fat milk and 1% BSA in PBT (PBS + 0.1% Tween20) at 4 °C, 
followed by >1 h incubation with either the 5mC or 5hmC antibodies (1:500 in 
blocking solution) at room temperature. Membranes were washed four times with 
PBT, incubated for 30 min with horseradish peroxidase (HRP)-conjugated goat 
anti-mouse or anti-rat antibodies (GE Healthcare; 1:10,000 in blocking solution), 
washed with PBT, and developed using the ECL+ detection system (GE 
Healthcare). For (h)MeDIP, three control templates with different sequences 
(~200-bp products containing C, 5mC or 5hmC; 15 pg each) were mixed with 
sonicated genomic DNA (1.5 µg) followed by denaturation (10 min at 95 °C) and 
immunoprecipitation as described previously for MeDIP using 2 µg of anti-5mC 
or anti-5hmC antibody and 8 µl Dynabeads (coupled with M-280 sheep anti-mouse 
IgG for the 5mC antibody or with Protein G for the 5hmC antibody, 
Invitrogen). Pulled-down products were detected by qPCR and normalized to 
the unbound fraction. 

Cell lines and other biological samples. J1 ES cell line (129S4/SvJae) was purchased 
from ATCC (catalogue no. SCRC-1010) and grown on a γ-irradiated 
pMEF feeder layer at 37 °C and 5% CO2 in complete ES medium (DMEM 
4,500 mg l−1 glucose, 4 mM L-glutamine and 110 mg l−1 sodium pyruvate, 15% 
fetal bovine serum, 100 U of penicillin/100 µg of streptomycin in 100 ml medium, 
0.1 mM non-essential amino acids, 50 µM β-mercaptoethanol, 10°U LIF 
ESGRO). E14 ES cells were grown either in complete ES medium or differentiated 
for 13 days into embryoid bodies via LIF removal and suspension culture. Retinoic 
acid (RA) differentiation of the mock siRNA treated and Tet1/2 knockdown ES 
cells was done with 2 µM final concentration of RA in complete ES medium 
without LIF for 24 h. Np95−/− ES cells (129/Ola derived) and TKO ES cells 
(Dnmt3a−/−, Dnmt3b−/− and Dnmt1−/−, J1-derived, gift from M. Okano) were 
grown in complete ES medium. Primary mouse embryonic fibroblasts were 
derived from E11.5 embryos (B6CBAF1 × B6) and grown for three passages in 
DMEM 4,500 mg l−1 glucose, 4 mM L-glutamine and 110 mg l−1 sodium pyruvate, 
10% fetal bovine serum, 100 U of penicillin/100 µg of streptomycin in 100 ml 
medium, 50 µM β-mercaptoethanol. Cerebellum sections were a gift from E. 
Ivanova and G. Kelsey. 

RNAi knockdown of Tet1 and Tet2 in ES cells. RNA interference experiments 
were performed as described with modifications. Transfections of Dharmacon 
siGENOME siRNA duplexes (Thermo Fisher Scientific) against mouse Tet1 (cata- 
logue no. D-062861-01; caacuugcauccacgauua), siGENOME SMARTpool against 
Tet2 (catalogue no. M-058965-01; gaaagcagcucgaaagcgu, ccucagauauuuauggaga, 
aculacuaacuccacccuaa, uagcaacguuuucuccuua) and siGENOME non-targeting 
siRNA#2 (catalogue no. D-001210-02; sequence not available) were done with 
Lipofectamine 2000 according to the manufacturer’s instructions. Cells were har- 
vested after three rounds of transfection for DNA/RNA isolation. 

Stable shRNA Tet1 knockdown and recovery. The pSM2 retroviral vector containing 
the shRNA-mir sequence targeting the Tet1 messenger RNA (tgctgttgac 
agtgagcgcgctagctatagagtatagtaatagtgaagccacagatgtattactatactctatagctagcttgcctactgce 
ctcgga) was purchased from Open Biosystems. shRNA-mir sequences were amp- 
lified by PCR using primers that created restriction sites for HindIII and NotI 
(atcaagcttcagggtaattgtttgaatgagec and agcggccgcgtcttccaattgaaaaaagtga), cloned 
into p2Lox and subsequently transfected into A2lox.cre ES cells (derived from 
the E14 cell line strain 129P2/OlaHsd, provided by M. Kyba). One day before 
transfection, Cre expression was induced by adding doxycycline (0.5 µg ml−1) to 
the complete ES cell medium to promote the stable site-specific integration of the 
shRNA-mir sequence, following inducible cassette exchange recombination, which 
renders the cells resistant to neomycin. 

ES cells were transfected using Lipofectamine 2000 (Invitrogen) at a concentration 
of 5 × 10° cells ml−1. One day after transfection, selection medium containing 
geneticin (G418, Melford; 300 µg ml−1 active concentration) was added to the cells 
and selection was maintained for 10 days. Resistant colonies were then individually 
picked into 96-well plates and expanded for freezing of stable cell lines. Integration 
was confirmed by PCR using Loxin primers. shRNA-mir expression was induced 
by adding 2 µg ml−1 of doxycycline (Sigma) to the culture medium for 5 days, 
followed by removal of the doxycycline from the ES medium for an additional 
7 days. RNA and DNA were isolated before removing the doxycycline to evaluate 
knockdown effects and after 7 days of recovery, using a Qiagen AllPrep DNA/RNA 
isolation kit. 

Immunofluorescence, microscopy and image analysis. Antibody staining of 
DNA methylation and hydroxymethylation was performed as previously 
described with modifications. Briefly, cells were fixed with 4% PFA for 15 min 
and, after permeabilization with 0.5% Triton X-100, the samples were treated with 
4 N HCl for 10 min at room temperature, washed in PBS Tween and blocked 
overnight; simultaneous incubation with both primary antibodies followed by 
simultaneous secondary detection was used. Antibody staining against CDX2 was 
performed as previously described. Mouse cerebellum cryosections (30 µm) were 
fixed with methanol (20 min at −20 °C) before staining with anti-calbindin D28K 
(CBP; gift from P. Emson) and post-fixing with 2% PFA, after which the same protocol 
as above was used for 5hmC staining. Single optical sections were captured with a 
Zeiss LSM510 Meta microscope (×63 oil-immersion objective) and the images 
pseudo-coloured using Adobe Photoshop. Semi-quantification of signals was performed 
on single optical sections using Volocity 5.2 (Improvision). 

DNA/RNA extraction. Genomic DNA was prepared using the Qiagen AllPrep 
DNA/RNA mini kit. RNA was extracted using either the Qiagen AllPrep DNA/ 
RNA mini kit or RNeasy mini kit and subjected to DNase treatment using the 
Ambion DNA-free kit according to the manufacturers’ instructions. 

(h)MeDIP and next generation sequencing. The (h)MeDIP-Seq protocol was 
performed as described above with the following modifications: after sonication of 
gDNA the ends of the DNA fragments were repaired and paired-end sequencing 
specific adaptors (Illumina) were ligated using either a Paired-End DNA Sample 
Preparation Kit (Illumina) or NEBNext DNA Sample Prep Reagent Set 1 (NEB). 
Following adaptor ligation, DNA was immunoprecipitated and purified. 
Fragments were amplified with 12-18 cycles using adaptor specific primers 
(Illumina); fragments ranging between 300 and 500 bp in size were gel-purified 
before cluster generation and sequencing. Sequencing was done on an Illumina 
Genome Analyzer GAIIX using Cluster Generation v2 and 4 chemistries as well as 
Sequencing by Synthesis Kits v3 and v4. Data collection was performed using 
Sequencing Control Software v2.5 and 2.6. Real-time Analysis (RTA) 1.5-1.8 were 
used for base calling. Genomic mapping of short reads was performed using the 
sequence_pair mode of ELAND in the Illumina CASSAVA pipeline v1.5-1.8. 
Details on the number of sequencing reads obtained for each run are shown in 
Supplementary Table 1. 

Bisulphite sequencing. J1 ES cell genomic DNA was bisulphite-treated using the 
Qiagen Epitect Kit and amplified using either Qiagen Hotstar Taq DNA 
Polymerase or Roche High Fidelity DNA Polymerase (primer sequences in 
Supplementary Table 2). A single amplification band was excised from the agarose 
gel and cloned into pGEM-T Easy vector (Promega) for sequencing. 

Sequenom MassARRAY. Genomic DNA was bisulphite treated using the Qiagen 
Epitect Kit. PCR amplification of target regions (primer sequences in Supplemen- 
tary Table 2), in vitro transcription and cleavage of the products for MassARRAY 
analysis were performed according to the manufacturer’s instructions. 

Glucosylation of genomic 5hmC followed by methylation sensitive qPCR 
(glucMS-qPCR). Genomic DNA (1 µg) was treated with T4 Phage β-glucosyltransferase 
(T4-BGT, NEB M0357S) according to the manufacturer’s instructions. 
Glucosylated genomic DNA (100 ng) was digested with 10 U of either HpaII, MspI 
or no enzyme (mock digestion) at 37 °C overnight, followed by inactivation for 
20 min at 80 °C. The HpaII- and MspI-resistant fraction was quantified by qPCR 
using primers designed around at least one HpaII/MspI site, and normalizing to the 
mock digestion control and two regions lacking HpaII/MspI sites (Supplementary 
Table 2). Resistance to MspI directly translates into percentage of 5hmC, whereas 
5mC levels were obtained by subtracting the 5hmC contribution from the total 
HpaII resistance. 
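
The arithmetic behind the glucMS-qPCR readout can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, the amplification efficiency of 2.0 per cycle is an assumption, and the additional normalization to the two regions lacking HpaII/MspI sites is omitted for brevity.

```python
def resistant_fraction(ct_digested, ct_mock, efficiency=2.0):
    """Fraction of template that survived digestion, relative to mock digestion.

    Assumes ideal qPCR amplification (efficiency 2.0, one doubling per cycle);
    this value is an assumption of the sketch, not a figure from the text.
    """
    return efficiency ** (ct_mock - ct_digested)


def gluc_ms_qpcr(ct_mock, ct_mspi, ct_hpaii):
    """Return (percent 5hmC, percent 5mC) at one HpaII/MspI site.

    After glucosylation, MspI resistance reports 5hmC only, whereas HpaII
    resistance reports 5mC plus 5hmC, so 5mC is obtained by subtraction.
    """
    mspi_resistant = resistant_fraction(ct_mspi, ct_mock)
    hpaii_resistant = resistant_fraction(ct_hpaii, ct_mock)
    pct_5hmc = 100.0 * mspi_resistant
    pct_5mc = 100.0 * (hpaii_resistant - mspi_resistant)
    return pct_5hmc, pct_5mc
```

For example, with illustrative Ct values of 20 (mock), 23 (MspI) and 21 (HpaII), `gluc_ms_qpcr(20, 23, 21)` gives 12.5% 5hmC and 37.5% 5mC.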

5hmC detection using thin layer chromatography (TLC). Detection of 5hmC 
within MspI sites was done as described previously. Briefly, 1 µg of genomic DNA 
was restriction-enzyme-digested with 20 U MspI and 10 µg RNase A overnight at 
37 °C, followed by inactivation of the enzyme at 65 °C for 20 min. DNA fragments 
were dephosphorylated with shrimp alkaline phosphatase and purified using 
QIAquick PCR purification kit followed by radioactive end labelling with 32P-ATP 
(10 µCi, 3.3 pmol) using T4 Polynucleotide kinase for 1 h at 37 °C. 
Radioactively labelled DNA was precipitated, resuspended in 18 µl DNase I buffer 
and fragmented to single nucleotides with 1 µl DNase I (10 U µl−1; Roche) and 1 µl 
SVPD (10 µg µl−1; Worthington) for 3 h at 37 °C. Samples of 1–5 µl were spotted 
onto PEI cellulose F TLC plates and developed with isobutyric acid:H2O:NH3 
(66:20:1 v/v/v) overnight followed by drying of the plate and exposure of radioactivity 
on an imaging film. 

mRNA library preparation for next generation sequencing (RNA-Seq). mRNA 
was isolated from 3 µg total RNA using Dynabeads mRNA DIRECT (Invitrogen) 
and fragmented with RNA fragmentation reagent (Ambion). First strand cDNA 
synthesis was done with SuperScript III First-Strand Synthesis System and 3 µg µl−1 
random hexamers (Invitrogen) followed by second strand synthesis with DNA 
Polymerase I and RNase H. After purification, a sequencing library was generated 
from the double stranded cDNA using paired-end adaptors (Illumina) and 
NEBNext DNA Sample Prep Reagent Set 1 (NEB), and sequenced following a 
single-end sequencing protocol. Sequencing was done on an Illumina Genome 
Analyzer GAIIX using Cluster Generation v4 chemistry and Sequencing by 
Synthesis Kit v4. Data collection was performed using Sequencing Control 
Software v2.6. Real-time Analysis (RTA) 1.6 was used for data monitoring. 
Spliced mapping of RNA-Seq data was performed with TopHat v1.0.14 using 
default parameters. Details on the number of sequencing reads obtained for each 
run are shown in Supplementary Table 1. 

Quantitative reverse transcription PCR. RNA was extracted using the Qiagen 
RNeasy Mini kit and subjected to DNase treatment using the Ambion DNA-free 
kit according to the manufacturers’ instructions. cDNA was constructed from 2 µg 
of this RNA using the SuperScript III First-Strand Synthesis System for RT-PCR 
using random hexamers to prime the reaction. This cDNA was diluted 1:50 and 
used as template for quantitative real-time PCR reactions in combination with 
Brilliant II SYBR Green QPCR Master Mix and primers designed to specifically 
amplify a small product (intron-spanning where possible) for each gene of interest 
(Supplementary Table 2). Cycling reactions were performed in duplicate and cycle 
threshold (Ct) fluorescence data recorded on a Stratagene Mx3000P thermal cycler 
and Bio-Rad C1000 Thermal Cycler. The relative abundance of each gene of 
interest was calculated on the basis of the ΔΔCt method, where results were 
normalized to the average Ct of two housekeeping genes with consistent Ct values 
over all samples (Atp5b and Hsp90ab1). 
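
As an illustration of the ΔΔCt calculation described above, the following sketch normalizes a gene of interest to the mean Ct of the two housekeeping genes and expresses it relative to a control sample. The function names are hypothetical, technical duplicates are assumed to have been averaged beforehand, and a doubling of product per cycle is assumed.

```python
from statistics import mean


def delta_ct(ct_gene, ct_housekeeping):
    """Ct of the gene of interest minus the mean Ct of the housekeeping genes."""
    return ct_gene - mean(ct_housekeeping)


def relative_abundance(ct_gene_sample, ct_hk_sample, ct_gene_control, ct_hk_control):
    """Relative abundance by the delta-delta-Ct method.

    ct_hk_* are sequences of housekeeping Ct values (here two genes, such as
    Atp5b and Hsp90ab1). Assumes one doubling of product per PCR cycle.
    """
    ddct = delta_ct(ct_gene_sample, ct_hk_sample) - delta_ct(ct_gene_control, ct_hk_control)
    return 2.0 ** (-ddct)
```

With illustrative values, a gene at Ct 26 in knockdown cells and Ct 24 in mock cells, with housekeeping Ct values of 18 and 20 in both, yields `relative_abundance(26, [18, 20], 24, [18, 20])` equal to 0.25, that is, a fourfold downregulation.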

Bioinformatics. Analysis of gene region enrichment. An initial identification of 
enriched clusters was performed on a data group, which combined the reads from 
all of the individual data sets. Clusters were identified where the density of reads 
represented an enrichment of >1 fold (equating to 19 reads) over at least 50 bp. 
Where adjacent clusters were within 20 bp of each other they were combined. All 
clusters were then quantified with log2-transformed count of the number of 
overlapping reads. The read counts were normalized both to the total count of 
the largest data set by applying a linear scaling factor, and by the length of the 
cluster. Clusters with a quantified value of more than 12 in any sample were 
removed because these represented an unrealistically high level of enrichment 
which was most likely due to mismapping of data. Each cluster was then called as 
present or absent in each sample. A cutoff of 5 (approximately equivalent to the 
median in most samples) in the corrected counts was taken as the point above 
which a cluster would be said to be present. All clusters were divided into groups 
based on the position of the centre of the cluster falling into a promoter, gene, 
exon, intron or intergenic region. Finally counts were made for the number of 
present clusters per kilobase of sequence in each of the different genomic region 
classes. Final enrichment values were calculated as the log2 ratio of the clusters 
per kb in the selected genomic region compared to the whole genome cluster 
density. Similar results were obtained when normalizing the data to an unbound 
fraction of a MeDIP experiment subjected to sequencing, ruling out any potential 
mapping effects. 
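
Two steps of this procedure, the merging of nearby clusters and the final enrichment value, can be sketched as below. This is an illustrative reconstruction under stated assumptions rather than the SeqMonk implementation: clusters are assumed to be (start, end) tuples on a single chromosome, "within 20 bp" is read as a gap of at most 20 bp, and all function names are hypothetical.

```python
import math


def merge_clusters(clusters, max_gap=20):
    """Combine clusters whose gap to the previous cluster is at most max_gap bp."""
    merged = []
    for start, end in sorted(clusters):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the previous cluster instead of opening a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def region_enrichment(present_in_region, region_kb, present_in_genome, genome_kb):
    """log2 ratio of present clusters per kb in a region class versus the whole genome.

    Raises ValueError if no cluster is present in the region (log2 of zero).
    """
    region_density = present_in_region / region_kb
    genome_density = present_in_genome / genome_kb
    return math.log2(region_density / genome_density)
```

For instance, a region class with 40 present clusters in 10 kb, against a genome-wide density of one cluster per kb, has an enrichment value of 2 (fourfold enriched).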

Strand bias analysis. Clusters were generated as for the region enrichment ana- 
lysis. For each cluster a count was made for the number of forward and reverse reads 
overlapping with that cluster in every sample. A cluster was discarded in a sample if 
the total count of forward plus reverse was less than 20 or greater than 200. For each 
valid cluster a bias value was calculated as abs(log2((f + 1)/(r + 1))) − c(f + r), 
where c(f + r) represents an averaged value from a simulation of observed strand 
biases from clusters with different numbers of reads, but where the probability of 
reads being either forward or reverse was exactly 0.5. The valid bias values were then 
attributed to whichever genomic regions they fell into and an average value for each 
genomic region for each sample was calculated. High values would indicate a higher 
than expected level of asymmetry in individual clusters and a zero value would
indicate a completely symmetrical sample. A control sample of sonicated input
DNA showed a mean bias of +0.021 (data not shown). 
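
A minimal sketch of the bias calculation, assuming that f and r are the forward and reverse read counts for a cluster and that the correction term is the mean absolute log2 ratio over simulated clusters with the same total read count and a 0.5 probability per strand. The function names, the number of simulation trials and the fixed random seed are illustrative choices, not taken from the text.

```python
import math
import random


def raw_bias(f, r):
    """Absolute log2 ratio of forward to reverse reads, with a pseudocount of 1."""
    return abs(math.log2((f + 1) / (r + 1)))


def expected_bias(n, trials=2000, rng=None):
    """Mean raw bias for clusters of n reads when each read is forward with p = 0.5."""
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    total = 0.0
    for _ in range(trials):
        f = sum(1 for _ in range(n) if rng.random() < 0.5)
        total += raw_bias(f, n - f)
    return total / trials


def strand_bias(f, r, rng=None):
    """Corrected bias, or None when the cluster fails the 20-200 read filter."""
    n = f + r
    if n < 20 or n > 200:
        return None
    return raw_bias(f, r) - expected_bias(n, rng=rng)
```

Because the correction term is subtracted, a perfectly balanced cluster scores slightly below zero, and only clusters more asymmetric than chance would predict score above zero.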

Repeat analysis. For non-directional repeat analysis all currently known 
instances of repeats were retrieved from the Ensembl database (totalling nearly 
9.5 million), and the sequence information for the major types of repeats were 
concatenated to form individual repeat genomes. The repeat content of different 
samples was determined using in-house developed software that aligns all 
sequences of Illumina sequence files to the entire repeatome employing multiple 
instances of Bowtie (ref. 37). The number of aligning sequences was counted for each
repeat type individually. 

RNA-Seq analysis. Initial RNA-Seq quantification was performed over each 
exon and was expressed as the number of overlapping bases of sequence per base 
of exon. The values were normalized to total read count between samples and an 
overall expression level was calculated for each transcript by normalizing the total 
expression value of the exons in that transcript to its total exon length. On the Tet1/ 
2 knockdown experiment we found that effects in expression were proportional to 
the amount of measured knockdown, and the first of three biological replicates had 
considerably lower amounts of Tet1 and Tet2 transcripts when compared with the
remaining. We therefore classified a transcript as differentially expressed if: (1) its 
expression level was above 2.5; (2) it was up- or downregulated in the first replicate 
by >1.5-fold relative to both untransfected and mock KD controls; (3) it was up- 
or downregulated by >1.15-fold in the other two replicates. Despite these low 
thresholds 18 out of 22 downregulated genes were validated by qRT-PCR in five 
biological replicates. For the comparison of the published ES and EB data sets’’, we 
classified transcripts as differentially expressed if their expression ratio was larger 
than fourfold. For correlation of expression with 5mC and 5hmC levels, promoters 
were classified as 5mC- or 5hmC-high if their normalized read count (log2) was
above 5.5 and 5mC- or 5hmC-low if it was below 4.5 (medians were between 4.2 
and 5.5). Similar results were obtained with a large range of threshold values or by 
using quantile-based thresholds. 
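
The three-part rule for the Tet1/2 knockdown experiment can be sketched as below. Several details are assumptions made for illustration only: that expression values are on a linear scale, that the expression filter applies to the highest value observed, that the weaker replicates are also compared against both controls, and that the direction of change must agree across all replicates. The function and key names are hypothetical.

```python
def fold_direction(kd, control, threshold):
    """Return +1 or -1 if kd is up or down versus control by more than threshold-fold, else 0."""
    if kd <= 0 or control <= 0:
        return 0
    ratio = kd / control
    if ratio > threshold:
        return 1
    if ratio < 1.0 / threshold:
        return -1
    return 0


def is_differential(replicates, min_expression=2.5):
    """Apply the three criteria to one transcript.

    replicates: three dicts with keys 'kd', 'untransfected' and 'mock';
    the first entry is the replicate with the strongest knockdown.
    """
    thresholds = [1.5, 1.15, 1.15]  # fold-change cutoffs stated in the text
    # Criterion 1 (assumed interpretation): some measured value exceeds 2.5
    if max(v for rep in replicates for v in rep.values()) <= min_expression:
        return False
    directions = set()
    for rep, threshold in zip(replicates, thresholds):
        vs_untransfected = fold_direction(rep["kd"], rep["untransfected"], threshold)
        vs_mock = fold_direction(rep["kd"], rep["mock"], threshold)
        # Criteria 2 and 3: the change must pass against both controls
        if vs_untransfected == 0 or vs_untransfected != vs_mock:
            return False
        directions.add(vs_untransfected)
    return len(directions) == 1
```

For example, a transcript that falls from 10 to 4 in the first replicate and from 10 to 8 in the other two would pass, whereas one that falls only to 9.5 in the second replicate would not.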

BS-Seq analysis. A mouse ES cell shotgun BS-Seq data set’ was obtained 
from GEO (accession number GSE19960) and remapped to the mouse genome 
(build NCBIM37) using Bismark (http://www.bioinformatics.bbsrc.ac.uk/projects/bismark).
Prior to performing alignments the first 5 bp of all reads in the mouse ES
cell shotgun data were clipped off to remove adaptor sequence. To validate our 
results with the above-mentioned publicly available BS-Seq data set, clusters were
generated as described for the region enrichment analysis. In addition, peaks
identified in (h)MeDIP-Seq were further subdivided into ‘5hmC regions with low 5mC’
and ‘5hmC regions with high 5mC’. Peaks with an abnormally high coverage of
bisulphite reads were identified with a box-whisker distribution filter and extreme 
outliers (ratio of >threefold above median) were excluded from the analysis. The 
read coverage and methylation levels underlying these peaks or different genomic 
features were analysed using SeqMonk. 




30. Tsumura, A. et al. Maintenance of self-renewal ability of mouse embryonic stem 
cells in the absence of DNA methyltransferases Dnmt1, Dnmt3a and Dnmt3b. 
Genes Cells 11, 805-814 (2006). 

31. Iacovino, M. et al. A conserved role for Hox paralog group 4 in regulation of
hematopoietic progenitors. Stem Cells Dev. 18, 783-792 (2009).

32. Ting, D. T. et al. Inducible transgene expression in mouse stem cells. Methods Mol.
Med. 105, 23-46 (2005).

33. Santos, F. et al. Epigenetic marking correlates with developmental potential in
cloned bovine preimplantation embryos. Curr. Biol. 13, 1116-1121 (2003).

34. Ng, R. K. et al. Epigenetic restriction of embryonic cell lineage fate by methylation of
Elf5. Nature Cell Biol. 10, 1280-1290 (2008).

35. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with
RNA-Seq. Bioinformatics 25, 1105-1111 (2009).

36. Livak, K. J. & Schmittgen, T. D. Analysis of relative gene expression data using real-
time quantitative PCR and the 2^(-ΔΔCT) method. Methods 25, 402-408 (2001).

37. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient 
alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 
(2009). 


©2011 Macmillan Publishers Limited. All rights reserved 


H. WANG/CHURCH LAB/HARVARD UNIV. 


THE NEXT STEP FOR THE 
SYNTHETIC GENOME 


Biologists have copied an existing genetic code, but haven’t yet commercialized it or 
written their own. What will it take for a tour de force to reach industrial force? 


BY MONYA BAKER 


A year ago this week, headlines trumpeted
humans had created artificial life.

Scientists at the J. Craig Venter Institute 
in Rockville, Maryland, had chemically syn- 
thesized DNA and placed it inside a bacterial 
cell emptied of its own genetic material. Tests 
a few days after the insertion showed that the 
1-million-base-pair-long synthetic genome was 
able to run the cellular machinery.

Whole-genome engineering could one day 
create cells unbound by biochemistry as we 
know it, says George Church, a geneticist at 
Harvard Medical School in Boston, Massachu- 
setts. Researchers might even be able to design 
a new genetic code, one that could incorporate 
more than the 20 or so amino acids used by nat- 
ural living systems. That achievement is “going 
to be more than an increment”, says Church,
“that’s going to be a game-changer”. But cur- 
rent reality is more prosaic. As Venter Institute 
staff celebrated their cell’s first birthday with 
a chocolate-and-spice layer cake topped by a 
miniature microscope made of sugar, they were 
well aware that the era of synthetic genomes 
still faces plenty of growing pains. 

Breathless headlines notwithstanding, the 
Venter Institute team did not create life so 
much as copy an existing plan. In this case, they 
acted more like scribes than authors. Synthetic 
biologists are also working on changing DNA 
sequences — trying to engineer microbes for 
practical applications such as decontaminating 
toxic waste, tracking down tumours or secret- 
ing biofuels — but few work with more than 
ten genes at a time. The story of the field so far 
is, “can write DNA, nothing to say”, says Drew
Endy, a synthetic biologist at Stanford University
in California. “We can compile megabases
of DNA, but no one is designing beyond the
kilobase scale.”

Strains of Escherichia coli have been developed to produce lycopene, an antioxidant found in tomatoes.

“Most of us are still working on a small scale
because there are interesting questions there
and because that’s what we have the technology
to build,” says James Collins, a biomedical
engineer at Boston University in Massachusetts.
“We frankly don’t understand biology well
enough to start designing genomes de novo.”

Many technologies must fall into place before 
researchers will be able to routinely work with 
even tens of genes at a time. Putting together 
huge DNA molecules is time-consuming and 
expensive, and designing biological compo- 
nents to perform a particular task is a challenge 
for parts of genes, let alone whole genomes. 
Transplanting DNA molecules into cells is not 
easy, nor is getting the DNA to ‘boot up’ once 
it is in place. And because the genomes will be 
far from perfect, researchers will need ways to 
tweak and test many variants. 

No one expects fast results, and much of 
the work will be tedious. The Venter Institute 
spent 15 years and US$40 million creating the 
technology to build and transplant a genome. 
The 2010 paper lists two dozen authors. “This 
was a debugging process from the beginning,” 
says Craig Venter, founder of the institute. 
“99% of our experiments failed.” And failed
experiments were costly: a single error in a
million base pairs set the project back months. 
Not counting scientists and their equipment, 
four species were involved in the genome 
transplant: Mycoplasma mycoides to provide 
the source code, Escherichia coli to copy DNA 
pieces, baker’s yeast (Saccharomyces cerevisiae) 
to assemble them into a million-base-pair cir- 
cle and Mycoplasma capricolum to provide the 
recipient shell. No wonder more synthetic biol- 
ogists are thinking about parts of genes than 
are dreaming of constructing whole genomes. 


LEARNING TO WRITE GENOMES 
Synthetic biology often adopts the language 
of engineers: rather than talking about genes, 
networks and biosynthetic pathways, prac- 
titioners prefer to talk about parts, devices 
and modules. ‘Parts’ refer to the protein- 
coding section of a gene and sundry regula- 
tory sequences that tune gene expression. A 
‘device’ is an assembly of parts that together 
perform a particular function, often turning a 
protein’s production on or off. And a ‘module’
or pathway is a collection of devices that carry
out more-complex functions, such as coordinating
a chemical synthesis or shunting cells
between ‘growth’ and ‘production’ modes.

Jeff Hasty, a bioengineer at the University 
of California, San Diego, used three genes to 
make bacteria light up in sync. Each gene is
indirectly activated by the same small mol- 
ecule: one controls the production of the 
molecule, another directs its degradation and 
the third makes the fluorescent protein that 
causes the cell to flash. The molecule diffuses 
between cells and coordinates bursts of pro- 
tein production. 

Engineered bacteria produce flashes of fluorescence, controlled by three genes in a synchronized circuit.

Another example of bioengineering involves
a dozen or so genes from multiple species. Jay
Keasling, a biologist at the University of California,
Berkeley, engineered E. coli and yeast
cells to make a precursor of the malaria drug
artemisinin at one-tenth of the cost incurred
by the conventional method of production:
extracting the natural product from sweet
wormwood. (More importantly, microbes
grow faster than the plants, which are in lim- 
ited supply.) Sanofi-aventis in Paris and the 
Institute of One World Health in South San 
Francisco, California, plan to start distribut- 
ing the synthetic form of artemisinin next year. 

But Keasling’s achievement is an object les- 
son in the time and expense involved. He and 
his colleagues began work a decade ago and had 
a $43-million grant from the Bill & Melinda 
Gates Foundation in Seattle, Washington, to 
Keasling’s lab and to Amyris Biotechnologies 
in Emeryville, California, a company that Keas- 
ling co-founded in 2003. Researchers had to 
track down a previously unidentified enzyme, 
and engineered a dozen further yeast enzymes 
not just to work in E. coli, but also to operate at
the right levels to move chemical intermediates 
towards a desired product without poisoning 
the cell or wasting resources. 

To speed up such projects, the Massa- 
chusetts Institute of Technology (MIT) in 
Cambridge maintains a Registry of Standard 
Biological Parts (http://partsregistry.org) 
that lists thousands of components. However, 
descriptions of these parts are often incomplete,
and they don’t all work as described.

To address this issue, Endy and bioengineers 
from the University of California, Berkeley, 
launched the International Open Facility 
Advancing Biotechnology (BioFab) in Emery- 
ville in 2009, with a grant from the National Sci- 
ence Foundation. The BioFab aims to boost the 
supply of working parts, both by optimizing the
parts themselves and by developing systems to 
swiftly design genetic constructs. The goal, says 
Endy, is to create a set of genetic regulatory ele- 
ments to precisely control the rates and levels of 
protein production. The BioFab currently pro- 
vides 350 promoters, grouped into ten levels of 
protein production. Having a range of options 
is important, says Endy, because using the 
same sequences multiple times makes genetic 
constructs unstable. The team is assessing how 
these elements behave in systems and under dif- 
ferent E. coli growth conditions. 

Eventually, the researchers hope to create 
vast libraries combining variants of different 
parts. This will let them compare the parts’ 
performances, and pick the best ones. Com- 
puter analysis will then be used to model how 
different sequences affect gene expression, 
which can in turn predict how new combina- 
tions of parts will function. But the in silico 
design process can go only so far. “Models are 
not yet as predictive as they could be,” says 
Adam Arkin, a bioengineer at Berkeley and 
co-director of the BioFab. “In almost all cases 
of real application we are faced with some tink- 
ering,” he adds. And the more parts are com- 
bined, the more unpredictable are the results. 


SOME ASSEMBLY REQUIRED 

The biological parts are generally easy to come 
by — short stretches of DNA can be ordered 
from a variety of companies (see ‘Making 
DNA on the cheap’) — but physically assem- 
bling multiple parts can be cumbersome and 
Making DNA on the cheap


The description of the first chemical 
synthesis of a gene took up an entire 

issue of the Journal of Molecular Biology 

in December 1972. Then, copying an 
oligonucleotide of just 20 base pairs took a 
chemist two years. Now, researchers can
order oligonucleotides 50-200 base pairs 
long from several vendors, for shipment 
within 24 hours. 

“We ship tens of thousands of 
oligonucleotides every single day. That didn’t 
happen 20 years ago,” says John Havens, 
vice-president of business development at 
Integrated DNA Technologies in Coralville, 
Iowa, which supplied the oligonucleotides
used by the J. Craig Venter Institute in
Rockville, Maryland, to assemble the mouse
mitochondrial genome.

For an extra cost, oligonucleotides can 
be assembled into synthetic genes up to 
1,500 base pairs long, which are stitched 
into plasmids — circles of DNA that can 
be propagated in bacteria. Companies 
including Integrated DNA Technologies, 
GeneArt in Regensburg, Germany, OriGene 
in Rockville, Maryland, and DNA2.0 in 
Menlo Park, California, can make genes, 
check for errors and ship them to clients in 
two weeks or less. 

Prices have fallen, but much more 
slowly than those of multiplex gene 
sequencing (see ‘Reading and writing 
DNA’). Ten years ago, synthesis cost 
US$25 per base and sequencing cost 
$0.25, according to data compiled by 
Rob Carlson, a principal at Biodesic, a 
consulting firm in Seattle, Washington. 
Last August, the figures were $0.35 per 
base for synthesis and $0.00000317 
for sequencing. Synthesis at prices 
comparable to sequencing would 
greatly accelerate synthetic biology, says 
Christopher Anderson, a genetic engineer 
at the University of California, Berkeley. 
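
The size of the gap can be worked out directly from the per-base prices quoted above. The short sketch below does only that arithmetic; the dictionary layout and function names are illustrative.

```python
# Per-base prices in US dollars, as quoted from Rob Carlson's data
PRICES = {
    "ten_years_ago": {"synthesis": 25.0, "sequencing": 0.25},
    "last_august": {"synthesis": 0.35, "sequencing": 0.00000317},
}


def fold_drop(kind):
    """How many times cheaper a base has become over the period."""
    return PRICES["ten_years_ago"][kind] / PRICES["last_august"][kind]


def synthesis_premium(period):
    """How many times more a synthesized base costs than a sequenced one."""
    return PRICES[period]["synthesis"] / PRICES[period]["sequencing"]


for kind in ("synthesis", "sequencing"):
    print(f"{kind}: about {fold_drop(kind):,.0f}-fold cheaper")
for period in PRICES:
    print(f"{period}: synthesis costs about {synthesis_premium(period):,.0f} times sequencing")
```

On these figures, synthesis has become roughly 70 times cheaper per base while sequencing has become nearly 80,000 times cheaper, so the premium for writing a base rather than reading one has grown from 100-fold to more than 100,000-fold.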

Cheaper DNA should be coming. 


READING AND WRITING DNA 


The price of sequencing has fallen faster than 
that of making genes and oligonucleotides. 
Oligonucleotides have conventionally been
synthesized individually on glass-column 
supports, but now microarrays allow 
thousands of molecules to be assembled 
side-by-side; microfluidics, microelectrodes 
or even tiny beams of light can be 
programmed to direct the desired chemical 
reaction to each spot on the microarray. 
LC Sciences in Houston, Texas, MYcroarray 
in Ann Arbor, Michigan, and CustomArray 
in Mukilteo, Washington, all offer 
oligonucleotide libraries synthesized on 
microarrays; so does Agilent Technologies 
in Santa Clara, California, but only to a 
limited number of collaborators. 

So far, oligonucleotides produced on 
microarrays have been difficult to use for 
gene synthesis: only very small amounts of 
each DNA sequence are made; thousands 
of sequences are mixed together and 
hard to separate; and the error rate is 
usually far higher than that of conventional 
techniques. 

Work by George Church, a geneticist 
at Harvard Medical School in Boston, 
Massachusetts, and Jingdong Tian, a 
synthetic-systems biologist at Duke 
University in Durham, North Carolina, 
may help to change that. Tian designed 
a microarray in which groups of 60-mer 
oligonucleotides are synthesized in small 
wells, facilitating their assembly into genes. 
These microarrays could make many 
variants of the same gene, which could be 
quickly assessed for desirable properties, 
such as high levels of protein expression.
Currently, the costs of gene synthesis can
become prohibitive for such experiments.

Separately, Church and colleagues at 
Agilent, the Wyss Institute for Biological 
Engineering in Boston and Stanford 
University in California took another 
approach to assembling genes from a 
microarray. By carefully designing the 
oligonucleotides and PCR primers that 
target them, they could selectively amplify 
the oligonucleotides necessary for genes.
This method successfully made 40 out of 
an attempted 42 genes for single-chain 
antibodies, a particularly challenging target. 

Achieving that kind of performance on 
a commercial scale will be difficult, but the 
Wyss Institute is testing the waters. Later 
this year, it plans to use the technique to 
launch a service that will construct genes 
for as little as $10 for a 500-base-pair 
piece of DNA. “We are interested in what 
the scientific community can do with very 
cheap, though imperfect, sources of DNA,” 
says Church. M.B.
expensive. DNA molecules are either designed
using complementary DNA sequences or 
mixed in with DNA complementary to 
the opposing ends of the molecules that 
are to be joined. These are combined with 
enzymes that cut and join DNA. Research- 
ers can link elements using a system called 
BioBricks, in which sequences are cut out of 
circular genetic elements called plasmids by 
restriction enzymes specific to a particular 
series of nucleotides at the start and end of 
the sequence. The desirable parts are then 
stitched together into larger plasmids by other 
enzymes. (New England BioLabs in Ipswich, 
Massachusetts, sells a kit of the necessary 
enzymes and buffers.) Assembled sequences 
can then be replicated in bacteria. 

Each assembled DNA piece starts and ends 
with the same sequences as the component 
parts, theoretically allowing larger and larger 
components to be assembled sequentially. But 
only three elements can be put together in a 
single reaction, which generally takes a cou- 
ple of days. Reactions are also less successful 
with longer molecules, discouraging long 
assemblies. 

Tom Knight, a computer scientist and co-founder
of start-up company Ginkgo BioWorks
in Boston, invented BioBricks while
working at MIT and has redesigned the sys- 
tem for industrial applications. The propri- 
etary version can assemble up to ten parts in 
a single reaction, says company co-founder 
Barry Canton. This allows researchers to work 
on DNA molecules with as many as 100,000 
base pairs, although most of the pathways 
that Ginkgo is working on are half that size.
Just as importantly, most assembly steps can 
be performed by liquid-handling robots. For 
example, rather than bands of DNA being iso- 
lated from a gel, as in most methods, DNA 
molecules are collected onto and separated by 
suspended magnetic beads. Such automation 
speeds assembly and frees up lab technicians 
for more complicated tasks. 

But BioBricks-type methods are limited by 
their use of restriction enzymes. Because the 
enzymes cut DNA whenever they encounter a 
particular series of nucleotides, there are ‘for- 
bidden sequences’ that must be excluded from 
the genetic construct to avoid errant cutting. 
The larger a construct becomes, the harder it 
is to avoid such sequences. To circumvent this 
problem, researchers have developed assembly 
‘overlap’ methods, in which opposite ends of 
molecules are joined as DNA is copied. Dozens 
of separate pieces of DNA can be assembled in 
the same reaction, often totalling a few thou- 
sand nucleotides. These methods have their 
own drawbacks, however. Most copy DNA 
using the polymerase chain reaction (PCR), 
which can introduce errors. 
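
The ‘forbidden sequence’ constraint is easy to check in software. The sketch below scans a construct for the recognition sites of EcoRI, XbaI, SpeI and PstI, the four enzymes used in the original BioBricks assembly standard; the article does not name the enzymes, so that choice, like the function name, is an assumption made for illustration.

```python
# Recognition sites (5' to 3') of the enzymes assumed here. All four are
# palindromic, so scanning one strand also covers the other.
FORBIDDEN_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}


def find_forbidden_sites(sequence, sites=FORBIDDEN_SITES):
    """Return (enzyme, zero-based position) for every forbidden site in the sequence."""
    seq = sequence.upper()
    hits = []
    for enzyme, site in sites.items():
        start = seq.find(site)
        while start != -1:
            hits.append((enzyme, start))
            start = seq.find(site, start + 1)  # step by one to catch overlapping sites
    return sorted(hits, key=lambda hit: hit[1])


print(find_forbidden_sites("ATGGAATTCAAACTGCAGTT"))
```

Each hit would have to be removed, for instance by a synonymous codon change, before the part could be used. Every additional kilobase of sequence is another chance for a site to occur, which is why the constraint bites harder as constructs grow.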

There is a bewildering array of overlap assembly
techniques. ‘Gibson assembly’, invented
by Daniel Gibson and his colleagues at the 
Venter Institute, allows many sequences to 
be assembled in parallel, and can even stitch
together entire genomes. In one demonstration,
the team started with six hundred ‘60-mers’
(oligonucleotides 60 base pairs long), and
went on to assemble the 16.3-kilobase mouse
mitochondrial genome.

Other methods include Golden Gate Shuf- 
fling, sequence- and ligation-independent 
cloning (SLIC), splicing by overlapping 
extension (SOEing), enzymatic inverse PCR 
(EIPCR), overlap extension and more. Some 
commercial kits are available: In-Fusion, from 
Clontech in Mountain View, California, has a 
mix of enzymes that can assemble 15-base- 
pair overlaps of any desired sequence. Life 
Technologies in Carlsbad, California, sells a 
plasmid-construction kit, MultiSite Gateway, 
that can join molecules with specific overlap 
sequences; it also markets the GeneArt High- 
Order Genetic Assembly System, which can 
assemble 10 DNA molecules, totalling up to 
110 kilobases. 

Researchers also design their own assembly 
reactions. To help with this, the Joint BioEnergy
Institute in Berkeley has invented a design
tool, dubbed j5, that lets researchers work with
several DNA assembly protocols. It determines 
which overlap sequences to use, recommends 
the sequences to order from vendors and can 
instruct liquid-handling robots. Synthetic 
Genomics in La Jolla, California, which was 
co-founded by Venter, plans to start offering 
fee-for-assembly services later this year. 

Assemblies larger than about 100 kilobases 
may be best put together inside cells, because 
big DNA molecules are fragile and difficult 
to manipulate. In vitro replication is also less 
accurate than cells’ machinery. The Venter
Institute team managed to assemble a
583-kilobase genome in vitro, but it ultimately
developed an in vivo assembly system
for its synthetic genome.

Larger genomes than that of M. mycoides
have been assembled inside cells, albeit not
from synthetic starting points. In 2005,
Mitsuhiro Itaya, a biochemist now at
Keio University in Tsuruoka, Japan, and
his colleagues constructed a 3,500-kilobase
cyanobacterium genome. They cut the genome
of the bacterium Synechocystis PCC6803 into
large chunks and propagated them in specially
prepared plasmids in E. coli. The plasmids were
then transferred into a third species, Bacillus
subtilis, where the DNA was stitched together.

Bacterial cells carrying a synthetic genome can grow and divide like normal cells.

Assembly methods aren’t interchangeable.
Overlap sequences that work for one method
often don’t work for others, so researchers who
run into problems with one technique have to 
start from scratch, says Tom Ellis, a synthetic 
biologist at Imperial College London. Ellis is 
working with Geoff Baldwin, a biochemist also 
at Imperial, and other colleagues to develop 
rules to find out which sequences will work 
with multiple overlap techniques, including 
recombination in yeast and Bacillus. That way, 
if one technique doesn't work, researchers can 
try others quickly. 

These standards will also allow researchers 
to assemble DNA pieces in any order, says Ellis. 
Although a dictated order of assembly is fine 
for copying an existing genome, it does not let 
synthetic biologists test multiple possibilities. 
That issue is going to become more impor- 
tant as researchers move from working with 
thousands of base pairs to tens of thousands 
(see ‘Sizing up synthetic DNA’). If researchers
start building genomes or even large parts of 
genomes, they will have to think about how the 
DNA will wrap up on itself, and how they can 
place genes in chromosomes so that they end 
up in the right places, says Ellis. “It’s a whole 
other aspect we'll have to uncover if we're going 
to do genome engineering.” 


EDITING IS ESSENTIAL 

Jef Boeke, a molecular biologist at Johns Hop- 
kins Medical Institute in Baltimore, Mary- 
land, believes that genome-scale engineering 
is coming more quickly than many think. He 
is building artificial yeast chromosomes, each 
about the same size as the M. mycoides genome. 
Although he hasn't yet been able to design an 
entire new genome, he has developed tech- 
niques to make systematic alterations in exist- 
ing genetic codes. “It opens the door to a lot 
of imaginative change at the genome scale that
wasn’t possible before,” he says. For example,
one systematic study in 2008 deleted introns 
(regions within genes that don't code for pro- 
tein) from many yeast genes individually, and 
found that the procedure had surprisingly little 
effect on the growth and fitness of cells. Boeke
wants to use his techniques to find out what 
will happen if all introns are removed from the 
genome at once. 

But new possibilities introduce new prob- 
lems. For the next few years, large genome 
assemblies are going to take months to build. 
With every assembly, researchers will detect 
unanticipated errors or realize after the fact 
that another sequence should work better, 
predicts Ellis. Then they will need to decide 
whether to assemble the whole genome again, 
or just edit it. “There has not been widespread 
acknowledgement in the synthetic-biology 
community that this is going to be an issue 
as we go into bigger assemblies,” he says. The 
problem has already made itself felt: a quota- 
tion that the Venter Institute had incorporated 
into its synthetic genome turned out to contain 
a mistake, and is going to be altered. 

Another use of editing is to produce and 
compare many gene variants. In a colourful 
demonstration in 2009, Church and his col- 
leagues described a high-throughput editing 
system. Multiplex-automated genome engi- 
neering (MAGE) mixes bacteria with syn- 
thesized stretches of DNA that are designed 
to target many areas in the genome; carefully 
timed jolts of electricity cause the bacteria 
to take up the DNA as they grow in culture. 
Church used MAGE to alter 24 genes in E. coli 
at once, focusing on those involved in making 
lycopene, an antioxidant and pigment found 
GINKGO BIOWORKS


SOURCE: P. A. CARR & G. M. CHURCH. NATURE BIOTECHNOL. 27, 1151-1162 (2009).




The useful genome 


In the short term, advances in genome 
assembly are not likely to matter much 
commercially, says Rob Carlson, a 
principal at the engineering, consulting 
and design company Biodesic in Seattle, 
Washington. Getting the genes right is 
only part of the challenge; finding the 
best conditions for growing microbes 
and extracting their products is just as 
important. “People will build things that 
are as complex as can be understood 
and commercialized,” says Carlson. 
Amyris Biotechnologies of Emeryville,
California, is making biofuels by working
with about a dozen genes, not a million-
base-pair genome. “It’s true that you can
build something that size; it’s not true that
you would do that in a circumstance where
you have to worry about the bottom line,”
says Carlson. “The problem is in getting stuff
to work with the design and tinkering, rather
than the assembly.”

Genetic assemblies are screened for desired effects.

Amyris, Easel Biotechnologies in Costa 
Mesa, California, Gevo in Englewood, 


in tomatoes. Within three days, some bacte- 
rial cells were making five times more of the 
red stuff than cells in the starting population.
The need for custom equipment and the difficulty
of purifying transformed cells have kept
researchers from widely adopting the technique,
but the sheer number of genetic possibilities
that can be tested using MAGE is a huge
advantage, says Church. As many as 4 billion
E. coli genomes were produced in the course
of one experiment. “You’re not resting on the
outcome of one construct,” he adds. 
Mutagenesis and directed evolution of exist- 
ing genomes could also help synthetic biolo- 
gists to make up for current gaps in knowledge, 
says Collins. As more genes are brought into the 
system, he says, “uncertainty goes up exponen- 
tially, and you run up against the limits of what 
you can do modelling-wise”. And although 
computational approaches are not yet sophis- 
ticated enough to design new genomes, they are 


SIZING UP SYNTHETIC DNA 


Artificial DNA has grown from two-nucleotide 
molecules in 1955 to more than one million in 2010. 


[Figure: base pairs synthesized, plotted on a logarithmic scale.]


408 | NATURE | VOL 473 | 19 MAY 2011 


Colorado, Joule Unlimited in Cambridge, 
Massachusetts, and LS9 in South San 
Francisco, California, are making biofuels 
and petroleum extracts from yeast, blue- 
green algae, Escherichia coli and other 
microbes. These efforts are backed by 
high-profile scientists and venture capitalists. 
Synthetic Genomics in La Jolla, California, 
has a US$300-million partnership with 


good at modelling existing ones, he says. This 
understanding could help researchers to co-opt 
existing cellular networks to perform desirable 
tasks. “We are starting to see labs recognize that 
there is a lot to be exploited inside the cell,” says 
Collins (see “The useful genome’). 


BIOLOGY MATTERS 

The most difficult problem may well be one 
of the least discussed: putting the genome to 
work. Although Itaya has synthesized large 
genomes inside cells, the introduced genomes 
do not go on to produce proteins. Venter’s 
group had originally chosen Mycoplasma 
genitalium for the synthesis project because 
its genome was, at the time, the smallest 
known: only 583 kilobases. But M. genitalium 
grows so slowly that the team switched to its 
faster-growing cousin, even though its genome 
is twice the size. Making the DNA is not the 
rate-limiting step, says Venter. “It's much more 
dealing with the complexity of biology versus 
the chemical synthesis,” he says. 

In fact, Venter thinks that adapting genomes 
to work in different cell types may be one of 
the most difficult tasks. The creation of the 
first synthetic cell is illustrative: the team had 
to remove certain enzymes from recipient 
cells to keep them from cutting up the foreign 
DNA. And moving to other species is going 
to be even more difficult. Unlike Mycoplasma, 
many microbes contain tough cell walls that 
resist the introduction of DNA. “It’s quite likely 
that transplantation will be the unique step for 
each species,” says Venter. 
ExxonMobil and plans to use engineered
microbes to make clean water, fuel
and vaccines. Amyris, which went
public last year, is now valued at more
than $1 billion. One company that 
folded, Codon Devices of Cambridge, 
Massachusetts, had focused not on 
making a product but on supplying 
services: providing synthetic genes to 
order and helping companies to develop 
synthetic-biology applications. 

Ginkgo BioWorks in Boston,
Massachusetts, originally planned to 
offer services for large DNA assemblies, 
then decided to focus on engineering 
the microbes themselves. Business partners 
and investors are less interested in new ways 
to assemble DNA than in better ways to 
manufacture products, says Barry Canton, 
the company’s co-founder. “Companies 
may be shifting from a chemical synthesis 
platform to a biosynthesis platform,” he adds. 
Those who control the microbes control the 
means of production. M.B.


Like a child learning to write, researchers 
must be able to copy natural genomes before 
they can create new ones. One day, geneticists 
will be able to design code on large scales, fuel- 
ling as-yet-undreamed-of applications, says 
Venter. “After we sequenced the genome, ana- 
lysts were arguing that there was no more need 
for sequencing, and I argued that this was 
the starting point.” The question of whether 
whole-genome synthesis will be useful will 
prove foolish in time, Venter believes. “It’s 
like asking, ‘why would you want to invent an 
airplane when people already have horses?’”


Monya Baker is technology editor for Nature 
and Nature Methods. 
Forensic scientists must handle evidence, appear in court and understand the legal process.


The call of the crime lab


Forensic scientists can work in academia, government and 
the private sector, but the field is competitive. 


BY VIRGINIA GEWIN 


Cedric Neumann has witnessed first-hand his 
field’s scientific coming of age. “When I started 
my undergraduate degree, nobody wanted 
to work in forensic science; there was only a 
handful of programmes in the world,” he says.
Instead, “police officers were trained to work
in a lab.” But in the late 1990s, around the time
that Neumann began his graduate studies at the 
University of Lausanne in Switzerland, which 
has the oldest forensic-science programme in 
the world, dozens of undergraduate and post- 
graduate programmes worldwide began to 
churn out forensic scientists. 

Two things have made forensics a more vis- 
ible — and fashionable — career choice, says 
Neumann. An increased focus on using DNA
technologies to solve crimes has sparked a
demand for properly trained biologists. And 
several television drama series that glorify 
forensic science have generated so much 
interest in the field in the past ten years that 
students have been flocking to study it. 

Forensic science encompasses a range 
of disciplines — including DNA analysis, 
examination of fingerprints or footwear 
impressions, and firearm analysis — used to 
solve criminal cases. In recent years the field 
has come under scrutiny amid calls for peer 
review to establish the reliability and accuracy 
of many forensic methods — and to develop 
them further. 

The discipline has been subject to market 
pressures. While Neumann was doing his PhD 
in his spare time, he also worked at Britain’s 
Forensic Science Service (FSS), a government-owned
company that provided the bulk of
forensics services and research in the coun- 
try. But by last year the service was losing £2 
million (US$3.3 million) a month, and was 
deemed no longer cost-competitive against 
private-sector rivals who typically analyse 
only a few types of forensic evidence. The FSS 
is slated to close next March, making England 
and Wales the only parts of the world with a 
totally privatized forensic-science market. 

Neumann, now developing statistical tools to 
identify fingerprints at Pennsylvania State Uni- 
versity in University Park, is among the many 
forensic scientists who are concerned that 
privatization could compromise the quality of 
the science that has been achieved by services 
such as the FSS. For example, Neumann says, 
if the focus turns to only highly commercial 
products, such as DNA profiles, there may be 
less emphasis on analysing sources of evidence 
that require more specialized training, such as 
handwriting examination or tool marks, thus 
reducing the availability of the broad knowl- 
edge often needed to solve crimes. 

Still, crime isn’t going anywhere, and neither
is the need for forensic scientists — if anything, 
demand for their skills has increased over time. 
Today, forensic scientists can find jobs in gov- 
ernment labs, private industry and, increas- 
ingly, academia. Yet a surplus of trainees, an 
economic downturn and ever-shifting politi- 
cal agendas make career prospects in this field 
difficult to predict. The mixture of scientific 
acumen and forensic training needed to carve 
out a successful sleuthing career will depend 
on an applicant's personal goals. 


UNSTEADY DEMAND 

Although the field is growing worldwide, that 
growth is uneven. “The demand for forensic 
scientists is good, but spotty,” admits Jay Siegel,
director of the forensic and investigative sciences
programme at Indiana University–Purdue
University Indianapolis.

For example, the UK forensics scene is in 
the middle of a mighty shake-up as it pre- 
pares for the FSS’s closure — a move that will 
result in 1,600 lay-offs. The turnover promises 
to open opportunities for new graduates as 
private companies recruit to fill the gap. “We 
anticipate that many people working at the 
FSS may choose not to relocate to other providers,”
says Steve Allen, managing director
of LGC Forensics, a private forensic-services 
company headquartered in Teddington, UK. 
So far, says Gillian Tully, head of research
and development at the FSS, some 90% of
the 300 staff already laid off at the three shut- 
tered FSS sites in Birmingham, Chepstow and 
Chorley have left the profession completely; 
some retired and others didn’t want to move or 
were disgruntled by the privatization, she says. 
Allen hopes that LGC Forensics, Britain’s larg- 
est private forensics firm with 550 employees, 
will increase its staff by 50-100% as the FSS 
workload is divvied up. “We're out there right 
now actively recruiting at all levels,” he says. 

This short-term recruitment push, how- 
ever, might mask other long-term trends. 
Public-sector budgetary constraints continue 
to stymie recruitment worldwide. “We had a 
big growth phase a few years back. Now we are 
in a tight fiscal climate and I expect limited 
recruitment in the next two to three years,” 
says Gary Pugh, director of forensic services 
at New Scotland Yard in London, the head- 
quarters of the Metropolitan Police, who will 
rely solely on his in-house forensics team and 
private providers after the demise of the FSS. 
In the United States, economic conditions 
have grown so dire that cuts are being made 
at many of the 400 publicly funded crime labs 
that support local, state and federal branches of 
law enforcement. Alabama has been the most 
severely affected state, closing three crime labs 
to save money in the state budget. 

But if the economy is pushing recruitment 
down in some regions, politics and crime pat- 
terns can easily prompt hiring elsewhere. For 
example, New Jersey — where the governor, 
Chris Christie (Republican), is the former US 
attorney for the state, the chief federal law- 
enforcement officer — created 29 positions 
in the state Office of Forensic Sciences amid 
budget cuts. And, despite budget woes, admin- 
istrators in Los Angeles, California, are recruit- 
ing for the last of 26 DNA technician positions 


Jan De Kinder runs a Belgian institute that 
develops new forensic techniques. 
WHAT APPLICANTS NEED TO KNOW


The case for a forensics PhD 


Advantages 

• A PhD is a must if you know you want
to do forensics research at an academic
institution.

• A doctorate can make you a more credible
witness in court.

• Some laboratories, especially US federal
labs, tend to promote people with PhDs
more quickly and more willingly than
those without, says Jay Siegel, director
of the forensic and investigative sciences
programme at Indiana University–Purdue
University Indianapolis.


intended to reduce the city’s backlog of 6,000 
sexual-assault DNA-collection kits that need 
analysis, says Greg Matheson, director of the 
Los Angeles Police Department Crime Lab. 
Despite federal budget cuts, Vermont Senator 
Patrick Leahy (Democrat) introduced a federal 
bill in January to boost forensic research needed
to strengthen the quality of evidence routinely 
used in the criminal justice system. But Siegel 
doubts that the bill will pass in this climate. 


GETTING PROPER TRAINING 

The ‘CSI effect’ — the tremendous interest 
in the field aroused by the US television pro- 
gramme CSI: Crime Scene Investigation and 
similar dramas — has spurred a flood of appli- 
cations for forensic-science jobs. “If we have 
an opening for a forensic chemist, we can eas- 
ily get 200-300 applications for that position,” 
says Michael Medler, laboratory director of the 
forensic-services agency in Indianapolis, Indi- 
ana. The agency, an independent government 
entity, hires crime-scene specialists, forensic 
analysts and technicians. 

As a result of the interest, employers have
their pick of the talent, and increasingly choose 
applicants with postgraduate degrees. “In the 
past year, most forensic labs have become bigger,
and the scientific requirements for applicants
continue to become more rigorous,” says
Jan De Kinder, director of the Belgian National 
Institute of Criminalistics and Criminology 
in Brussels, a research body within the justice 
department that conducts original research 
and develops new forensic techniques. He 
says that half of the 100 people working in his 
lab have MSc or PhD degrees — a worldwide 
trend, he adds. And whereas a PhD is not a 
requirement for many of the jobs in a foren- 
sic lab, a lack of one can affect aspirations (see 
‘The case for a forensics PhD’).

The scientific skills most in demand include 
DNA profiling and mass spectrometry for ana- 
lysing the chemistry of trace specimens, as well 
Disadvantages

• There aren’t many university programmes
that offer a PhD in forensic science — and
those that exist are all in Europe. A more
typical route to employment, especially in
the United States, is to get a postgraduate
degree in chemistry or biology, with a
specialism in forensics.

• A PhD isn’t required for a typical
forensics job in a crime lab — and it can
be a deterrent to getting hired, because
employers might be concerned that the
candidate will get bored or leave for a job in
academia, says Siegel. V.G.


as statistical analysis to identify patterns found 
in other types of evidence, such as fingerprints. 
Siegel says that the biggest trend in academic 
forensics research at the moment is the attempt to 
validate the techniques used in such pattern-evi- 
dence analysis; a 2009 report by the US National 
Academy of Sciences highlights this need. 

Universities have seized on the grow- 
ing interest in forensics as a money-making 
opportunity. Hundreds of undergraduate and 
MSc forensic-science programmes now exist, 
but their quality varies widely. The American 
Academy of Forensic Sciences in Colorado 
Springs, Colorado — the field’s professional 
organization — established the Forensic Edu- 
cation Programs Accreditation Commission 
(FEPAC) in 2004 to accredit undergraduate 
and graduate degree programmes that meet 
minimum standards of excellence. 

Since then, 35 have been accredited, and that 
could grow to 50 within the next five to ten 
years, says Siegel. “The standards are quite rig- 
orous — the faculty and instruments needed 
to teach forensic science are pretty expensive,” 
he adds. 

Some employers tend to hire traditional 
chemistry or biology graduates rather than 
graduates of forensic-science programmes. 
“It’s really important to have a good scientific 
mindset and experience in biology or chemistry;
we can train them in the forensic part,”
says Allen. Positions at private companies, 
often further removed from the field’s judicial 
responsibilities, may not demand the forensics 
know-how required at public agencies. 

Chris Hassell, director of the US Federal 
Bureau of Investigation laboratory in Quantico,
Virginia, echoes the importance of a scientific
background. “We don’t generally target
people coming out of forensics disciplines; 
we're looking for good scientists,” says Hassell. 
He says that a candidate's scientific publication 
record can set her or him apart from the crowd 
— for example, if it includes a paper addressing
the validity of a specific forensic technique.

Commitment is key. “Unlike other areas 
of science, in forensics a person's credibil- 
ity is called into question daily in a court of 
law,” says Medler. In addition to mastering 
a range of scientific techniques, he says, 
forensic scientists must be able to identify 
the most probative pieces of evidence at the 
crime scene, must know how to document 
who has physical possession of evidence 
and why, have knowledge of the legal pro- 
cess and have the ability to communicate on 
a court stand. Much of that training must be
acquired on the job. 

In Europe, the training requirements for 
crime-lab analysts vary depending on which 
body has authority over the forensics opera- 
tions. For example, in France, Italy and Spain, 
forensic services are provided by the police; 
until recently, only trained police officers 
could work in crime labs. However, in Bel- 
gium, forensic labs are under the purview of 
the justice department. 

Applicants with criminal records or who 
fail drug tests face dim prospects. Matheson 
says that background checks disqualify up to 
two out of every ten candidates. 


CLOSING THE GAP 

Forensic science is considered a young 
field. Police labs, frequently inundated with 
caseloads, are often simply unable to perform 
much-needed research. And although there 
is a growing amount of forensics research in 
academia, interactions between practitioners 
and researchers can be limited. 

But as the number of forensic-science 
programmes at universities grows, and the 
PhD and MSc students chip away at research 
needs, the field’s scientific footing is 
expanding. “The advantage of having more 
university training programmes in forensics 
is the increase in research activities,” says 
De Kinder. Unfortunately, researchers still 
struggle to find funding. 

“To better our profession we need to do 
two things: encourage people with PhDs to 
get into forensics and overcome the discon- 
nect between academia and the practising 
field,” says Larry Quarino, chair of FEPAC
and director of the forensic-science pro- 
gramme at Cedar Crest College in Allentown, 
Pennsylvania. He advocates the creation of a
sabbatical that would allow practising foren- 
sic scientists to conduct academic research 
necessary for their positions. 

“For a scientific discipline to be a living 
discipline, it needs to conduct research,” says 
Pierre Margot, head of the school of criminal 
justice at the University of Lausanne. “As long 
as researchers are working on the needs of 
tomorrow,” says Margot, “I’m not too worried
about the state of the job market today.”


Virginia Gewin is a science journalist based 
in Portland, Oregon. 


TURNING POINT 
Jill Venton 


Jill Venton, an analytical chemist
at the University of Virginia in
Charlottesville, received the 2011 Society 
for Electroanalytical Chemistry Young 
Investigator Award in March for her 
efforts to develop sensors able to probe 
neurotransmitters in fruitflies. 


As an analytical chemist, do you find 
neurochemistry messy? 

Analytical chemists develop methods to 
quantify the composition and structure of 
matter, and I definitely think like an analyti- 
cal chemist — I like precise measurements 
with small error bars. But life does not take 
place in a beaker, and I knew early in my 
career that I wanted to apply my skills to 
biology. I did my PhD in analytical chemis- 
try with a neuroscience focus and found that 
I liked the field, so I followed up my degree 
by doing a postdoc supervised jointly by a 
chemist and a neuroscientist. By compari- 
son with chemistry, neuroscience is messy. 
It’s more exploratory, which often doesn't
lend itself to nice, neat experiments, because 
we know so little about the brain — but it 
has been fun and challenging to use my 
talent for precision to help develop ways to 
measure brain functions. 


How do you get your research ideas? 

Some come from colleagues. For example, 
a neuroscience colleague wanted to meas- 
ure neurotransmitters in the fruitfly brain 
and challenged me to help him find a way 
to do it. I had never thought of it before, but
I was exploring techniques to measure fast 
changes in neurotransmitters in the mam- 
malian brain, so I thought I could tackle it. 
Other ideas come from the need to keep 
pushing technology further and exploring 
the boundaries of what new methodology 
can tell us about neuroscience. 


What’s your strategy for winning early-career 
awards? 

I have applied for a lot of young-investi- 
gator awards, and certainly have not won 
them all. When I started out, I applied 
indiscriminately for any funding or award. 
I was lucky to get a US National Science 
Foundation career award early on, which 
helped to give my lab a foundation. Once 
I got that, I became pickier in terms of 
which awards to seek, because I didn’t have 
infinite amounts of time to apply to them. 
At the moment, I rely on national fund- 
ing agencies for my bread and butter, and 
apply for awards that have a certain level of
prestige to supplement that. 


You are awaiting a decision on tenure now. 
Was the tenure process what you expected? 

I knew that the tenure committee would 
look at grants and publications, and that 
there would be significant emphasis on let- 
ters written on my behalf from people out- 
side this institution. Many people do what’s 
called a ‘tenure tour’ in the year or so before 
they go up for tenure, working to raise their 
profiles and build a reputation in the field 
to ensure those positive tenure letters. I had 
a baby a year and a half before I went up for 
tenure, so my ability to travel was limited 
and I was more selective about where I went. 
For example, rather than presenting at sin- 
gle universities, I went to a Gordon Research 
Conference — an international gathering of 
scientists to discuss the frontiers of research. 
Before getting pregnant, I spent time net- 
working by meeting people at conferences 
and organizing workshops or symposia. 


Analytical chemistry is a male-dominated 
field. Does that pose challenges? 

Yes. I’m one of only three women in a depart- 
ment of about 30 — and the only woman 
with a child. But it is very typical in chemis- 
try for women to hold only 10% of the aca- 
demic positions. Still, this department has 
accommodated my efforts to set a flexible 
schedule to balance work and life. The big- 
gest challenge is that there weren't — and 
still aren't — many role models, successful 
female researchers. I had to look to biology 
and neuroscience for those.
SURVEILLANCE


BY JULIAN TANG 


“The problem with running a
country of 80 million people
is that it’s difficult to know
what people are thinking — I mean,
really thinking,” said the Prime Minister,
thoughtfully.

Henry Irvin cleared his throat. “If I may 
make a suggestion, Sir?” he started, smooth- 
ing his tie and sitting up a little straighter. 
“We know that what people say in public, 
particularly when asked to express their 
views specifically, may not represent what 
they really think or feel. This is not neces- 
sarily a deliberate intent to lie or mislead, 
but, more often than not, it is an attempt to 
comply with their current peer-group beliefs 
or teachings — like being with your friends 
at school or your colleagues at work. Except 
for a few outstanding individuals, this seems 
to be the norm.” 

The PM listened intently. 

“However,” Henry continued, “when 
they believe that they are really anonymous, 
such as in Internet chat rooms, their real 
beliefs are often expressed — particularly in 
response to key questions. These might be 
about anything from the current state of the 
economy, their favourite football team, their 
friends, colleagues, et cetera, et cetera...” 

“So what are you proposing, Henry?” 
asked the PM, cautiously. 

“Well, if you really want to know what the 
people are thinking, Sir, you could set up 
your own Internet chat rooms to encourage 
individuals to express themselves, anony- 
mously, about various topics of specific 
interest to you and monitor their responses 
to key questions.” 

The PM sighed, disappointed. “This is 
nothing new Henry — this has been tried 
before and nothing really serious is ever dis- 
cussed by serious people in these chat rooms.” 

Henry paused for a few seconds before 
replying. “I've been talking to some people 
at GCHQ and they have an interesting idea. 
You've heard of genomics, proteomics and 
metabolomics, right, Sir?” 

The PM nodded. 

“Well, the smart guys there have come 
up with a new speciality, ‘grammaromics’,
consisting of verbomics, nounomics, adjec- 
tivomics and other subspecialities.” 

“Are you pulling my leg, Henry?” asked 
the PM, only half-jokingly. 

Henry shook his head. “Not at all, Sir. In 
fact, they've tried some of their algorithms 


414 | NATURE | VOL 473 | 19 MAY 2011 


The word on the street. 


on the text from some of these Internet
chat rooms already. They’ve found that as
long as these individuals stay online for 
a while and type a minimum number of 
diverse words and responses, they can reli- 
ably recognize any particular individual by 
the way they use their English constructs. Of 
course, they cannot identify the individuals 
themselves by this method alone, but...” 

Henry paused again, for he knew that 
what he was about to say would not make 
the PM particularly happy. 

“.. well, they hacked into a limited num- 
ber of e-mail servers to see if they could 
identify these chat room users by matching 
these ‘grammaromes’ to any particular e-mail
text — just like trying to match fingerprints 
or DNA sequences. They then back-traced 
these individuals’ IP addresses from their 
chat rooms and their e-mails to see if they 
were one and the same individual.”

“You mean that GCHQ invaded their privacy
to test a hypothesis?” the PM demanded,
severely. Then, more curiously, he asked: “Did 
it work?” 

Henry breathed a sigh of relief. “Yes 
Sir. Remarkably well, in fact. It seems that 
individuals develop their own unique way 
of expressing themselves in writing that 
remains more or less unchanged for life — 
just like their fingerprints or DNA.”

“That’s amazing!” exclaimed the PM, 
genuinely surprised. “If we know that peo- 
ple express themselves more truly in Internet 
chat rooms and on e-mail, when they think 
they're anonymous, and if we can monitor 

these ‘genuine’ communications then... Henry, if this works, you
truly are a genius!”

Over the next few years of his first term, the PM’s popularity
soared. His policies were adopted in
record time and it seemed that he could do 
nothing wrong. Even the typically cynical 
British media were lost for words. 

Then one morning, there was a soft 
knock on his office door. “Come in 
Henry!” he yelled cheerfully, waving his 
pen. 

Henry entered, looking concerned. “Sir,” 
he began without ceremony. “My colleagues 
at GCHQ have noticed some unusual 
patterns of Internet chat room and corre- 
sponding e-mail activity.” 

The PM motioned Henry to the seat in 
front of him. “What is it, Henry?” he asked, 
now serious. 

“We think that someone, possibly via a 
leak from either GCHQ or this office, may 
have got wind of our clandestine Internet 
public-opinion surveillance strategy.” 

“Why would they think that, Henry?” 
retorted the PM, impatiently. “How would 
anyone outside our inner circle know or sus- 
pect anything? There's been no publication 
in that popular science journal — what's its 
name — Nature, yet has there?” 

“Not as far as I know, but that doesn’t 
mean that no one has figured this out by 
themselves. After all, you are doing remarka- 
bly well in the opinion polls, Sir — just about 
the most popular PM in British history.”

“So, what evidence do they have to make 
them think that anyone out there is trying 
to manipulate this Internet surveillance sys- 
tem?” 

“Well, it seems that an online consensus is 
building that we should consider abolishing 
VAT and income tax, as well as providing a 
Jaguar or Aston Martin to everyone passing 
their driving test, starting from the next tax 
year...” 

The PM digested this information for a 
moment then sat back in his chair, chuck- 
ling. “I guess it was too good to last, eh, 
Henry?” 

Henry allowed himself a rare smile. “So, 
back to business as usual then, Sir?” 

“You read my mind, Henry,” said the 
PM, still chuckling and picking up his pen 
again as Henry closed the office door quietly 
behind him.


Julian Tang is a clinical/academic virologist 
who has had several stories in Futures. Some 
that didn't make the final cut can be found 
in an anthology, soon to be available at 
amazon.com. 
Neural crest regulates myogenesis through the
transient activation of NOTCH 


Anne C. Rios¹, Olivier Serralbo¹*, David Salgado¹* & Christophe Marcelle¹


How dynamic signalling and extensive tissue rearrangements 
interact to generate complex patterns and shapes during embryo- 
genesis is poorly understood’ *. Here we characterize the signalling 
events taking place during early morphogenesis of chick skeletal 
muscles. We show that muscle progenitors present in somites 
require the transient activation of NOTCH signalling to undergo 
terminal differentiation. The NOTCH ligand Delta1 is expressed
in a mosaic pattern in neural crest cells that migrate past the
somites. Gain and loss of Delta1 function in neural crest modifies
NOTCH signalling in somites, which results in delayed or premature
myogenesis. Our results indicate that the neural crest
regulates early muscle formation by a unique mechanism that
relies on the migration of Delta1-expressing neural crest cells to
trigger the transient activation of NOTCH signalling in selected 
muscle progenitors. This dynamic signalling guarantees a balanced 
and progressive differentiation of the muscle progenitor pool. 
Early skeletal muscle (the primary myotome, which is composed of 
mononucleated post-mitotic muscle fibres, or myocytes) is formed 
from the generation of muscle cells at the four borders of the dermo- 
myotome, the dorsal-most epithelial compartment of somites*"®. 
Most of the dermomyotome undergoes an epithelial to mesenchymal 
transition that leads to the emergence of a population of resident 
muscle progenitors that massively contributes to the growth of all 
trunk muscles'’""*. The medial border of the dermomyotome, the 
dorsomedial lip (DML), remains epithelial for a considerable period 
of time, during which it generates muscle cells that contribute to the 
growth of the primary myotome. DML stem/progenitor cells can 
adopt two fates during the first days of embryonic muscle develop- 
ment**: to self-renew and remain in the epithelial border of the der- 
momyotome or to translocate in the myotome and undergo terminal 
myogenic differentiation. How this balance is regulated is unknown. 
In the chick embryo, the epithelial DML population comprises a 
majority (77%) of PAX7-positive cells interspersed by (23%) PAX7/ 
MYF5-positive cells (Supplementary Figs 1a-e, 2a-e). In the transition 
zone, cells shut-off the expression of PAX7, but maintain MYF5 
expression. Fully elongated myocytes express skeletal muscle myosin 
heavy chain (MyHC; also known as MYC). NOTCH family members 
are expressed in the DML, the transition zone and the myotome during 
the first phase of myogenesis (Fig. 1a)'°. The NOTCH target genes 
HES1/cHairy2 and lunatic fringe (LFNG) are expressed in a salt-and- 
pepper pattern within the DML. Many transition zone cells express 
HES1, whereas LFNG expression is low in this region. Their expression 
is low in the myotome (Fig. 1a, b and Supplementary Fig. 2f). Both
genes act as bona fide NOTCH targets in somites, as their messenger 
RNA expression is upregulated after electroporation of a constitutive 
form of NOTCH (NOTCH intracellular domain (NICD); Supplemen- 
tary Fig. 2g-i, m-o). To quantify the distribution of NOTCH activity, 
we electroporated a NOTCH reporter construct consisting of the 
mouse Hes1 promoter region upstream of a destabilized GFP
(d2EGFP; half life, 2 h), that efficiently responds to NOTCH activation 
and inhibition (Supplementary Fig. 2j-l). We co-electroporated a
human histone H2B (H2B)-RFP reporter gene driven by an ubiquitous
promoter to evaluate the normal distribution of electroporated cells. 
After 24h, 11% of H2B-RFP-positive cells were HES1-d2EGFP-positive
(Fig. 1c, d and Supplementary Fig. 2l). The H2B-RFP-positive
cells were distributed among PAX7- (60%), MYF5- (46%) and MyHC- 
positive (10%) populations. In contrast, nearly all (92%) HES1-positive 
cells were MYF5-positive (distributed in the DML and the transition 
zone), whereas only 24% and 2% expressed PAX7 and MyHC, respectively
(Fig. 1e-i and Supplementary Fig. 3a, b). We followed the morphogenic
movements of NOTCH-activating cells using live video
microscopy. Epithelial cells that activated the NOTCH reporter in 
the DML rapidly translocated in the transition zone (Fig. 1j-l and
Figure 1 | Notch is active during early myogenesis. a, Scheme showing the
expression of NOTCH signalling family members in the DML, transition zone
(TZ) and myotome. Serrate2 is also known as JAGGED2. b, Expression of chick
HES1/chick Hairy2 in the DML and the TZ. ISH, in situ hybridization.
c-h, Confocal stacks showing the expression (in green) of a HES1-d2EGFP
reporter construct and (in red) RFP (c, d), PAX7 (e, f) and MYF5 (g, h) in dorsal
(d, f, h) and transverse (c, e, g) views of somites 24h after electroporation.
i, Quantification of c-h. Error bars show standard deviation (s.d.).
**P < 0.0001. j-l, Time-lapse confocal analysis (see Supplementary Movie 1)
showing the translocation of two NOTCH-activating DML cells (blue and
white arrowheads) into the transition zone. T0, start of the movie; T1,
4h 50 min after the start; T2, 10 h after the start. My, myotome; NT, neural tube.
Scale bars, 50 µm.
Supplementary Movie 1). We followed their fate as they further differentiated,
using a construct that contains the Hes1 promoter region
upstream of a stable GFP (EGFP half life of over 24h). After 24 and 
48 h, the proportion of MyHC-positive myocytes was more than twice 
(24h: 34%; 48h: 60%) that of control RFP-electroporated embryos 
(24h: 13%; 48h: 28%; Supplementary Fig. 3c-g), further indicating 
that activation of NOTCH signalling is associated with myogenesis. 
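
As a quick arithmetic check of the "more than twice" comparison above, the following minimal Python sketch computes the ratio of MyHC-positive percentages between reporter-positive and control cells. Only the four percentages come from the text; the variable and function names are illustrative assumptions, and the ratio is a simple fold comparison, not the authors' statistical analysis.

```python
# Percentages of MyHC-positive cells as quoted in the text (Supplementary Fig. 3c-g).
# The dictionary and function names below are hypothetical, chosen for illustration.
reporter_pct = {"24h": 34, "48h": 60}  # cells that had activated the stable Hes1-EGFP reporter
control_pct = {"24h": 13, "48h": 28}   # control RFP-electroporated cells


def fold_enrichment(reporter, control):
    """Return the reporter/control ratio of percentages for each time point."""
    return {t: reporter[t] / control[t] for t in reporter}


ratios = fold_enrichment(reporter_pct, control_pct)
for time_point, ratio in ratios.items():
    print(f"{time_point}: {ratio:.2f}-fold")
```

Both ratios come out above 2, consistent with the statement that the proportion of myocytes more than doubles among cells that had activated NOTCH signalling.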

Altogether, this indicates that NOTCH signalling is activated in 
DML cells that engage in the myogenic program before they translo- 
cate into the transition zone. NOTCH signalling remains active in the 
transition zone and is extinguished before cells undergo terminal myo- 
genic differentiation and elongate into myocytes. 

We inhibited NOTCH activity in somites using a truncated,
dominant-negative form of the NOTCH transcriptional co-activator
mastermind (DN MAML1; refs 16, 17) and small interfering RNAs (siRNAs)
against NOTCH1 (ref. 18). DN MAML1 and the NOTCH1 siRNA gave
similar results one day later, that is, a drastic reduction of myogenic
differentiation (Fig. 2). This was characterized by a sharp reduction of
MYF5-positive cells (7% DN MAML1; 3% siRNA NOTCH), compared
to controls (49% CAGGS-IRES-EGFP; 53% siRNA luciferase), and by
a halt of terminal differentiation (0% MyHC-positive cells for DN
MAML1 and siRNA NOTCH; controls: 12% CAGGS-IRES-EGFP;
8% siRNA luciferase; Fig. 2s, t), with no change in dermomyotome cell
proliferation (Supplementary Fig. 4a-d). Virtually all cells in which

Figure 2 | NOTCH signalling is necessary for myogenesis. a-r, Confocal
stacks of somites in dorsal (b, d, f, h, j, l, n, p, r) and transverse
(a, c, e, g, i, k, m, o, q) view, 24 h after electroporation of (in green) CAGGS-EGFP
as controls (a-f), DN MAML1 (g-l) and siRNA against chick NOTCH1
(m-r), and stained (in red) for PAX7 (a, b, g, h, m, n), MYF5 (c, d, i, j, o, p) and
MyHC (e, f, k, l, q, r). s, Quantification of a-l. t, Quantification of m-r; siRNA
against luciferase as controls are not shown. Error bars show s.d.
***P < 0.0001. Scale bars, 50 µm.

NOTCH signalling was inhibited remained epithelial in the dermomyotome
(98% DN MAML1; 97% siRNA NOTCH; controls: 56%
CAGGS-IRES-EGFP; 57% siRNA luciferase). Altogether, this strongly 
indicates that NOTCH signalling is necessary for the initial phases of 
myogenesis of DML cells. 

We induced a gain of NOTCH function by electroporating NICD in
newly formed somites. One day later, most electroporated cells had
translocated into the transition zone (Fig. 3a-f); however, most (83%)
expressed the dermomyotomal marker PAX7 (Fig. 3a, b, g), and only
a few (3%) were MYF5 positive (Fig. 3c, d, g). Although a few electroporated
cells entered the myotome region, they did not elongate and
never (0%) initiated MyHC expression (Fig. 3e-g). This result is consistent
with studies showing that NOTCH signalling inhibits muscle
differentiation in various contexts (refs 19-21). However, characterizing the
electroporated cells 6 h after electroporation of NICD, we observed a
robust increase in the proportion of electroporated cells expressing
MYF5 (80%; controls, 17%; Fig. 3h-k and Supplementary Fig. 5n).
Strong MYF5 activation was maintained 12 h after electroporation
(89%; controls, 20%; Fig. 3l-o and Supplementary Fig. 5n). After 6 h,
all MYF5-positive electroporated cells were positioned in the epithelial
DML (Fig. 3j); at 12 h, most electroporated cells had translocated into the
transition zone (Fig. 3n). The same observations were made with
MYOD (Supplementary Fig. 5a-m). Altogether, this indicates that the
first steps of myogenesis (the activation of MYF5 and MYOD) are
promoted by a short activation of NOTCH signalling. However, a sustained
activation of NOTCH reverses the myogenic program, resulting
in a downregulation of MYF5 and MYOD expression and a return to a
PAX7-positive state.

To test this, we used a doxycyclin-inducible system to drive NICD
expression. In the continuous presence of doxycyclin, NICD expression
was maintained in electroporated cells and, consistent with our
previous observation (Fig. 3a-f), most of them translocated into the
transition zone but did not maintain MYF5 expression (6%, Fig. 3r,
s, v; controls, 42% MYF5-positive, Fig. 3p, q, v). When doxycyclin was
removed, NICD was strongly expressed 6 h later, but was almost undetectable
after overnight incubation (Supplementary Fig. 6c, d, f).
Remarkably, after this transient activation of NOTCH signalling, most
electroporated cells had translocated into the transition zone and the
myotome and nearly all (97%) expressed MYF5 (Fig. 3t-v). In addition,
electroporated cells that were positioned in the myotome had
elongated into myocytes, indicating that they initiated terminal differ- 
entiation. The lack of electroporated cells in the DML (arrowheads in 
Fig. 3t) indicates a depletion of the DML progenitor cell population 
and suggests that the pulse of NOTCH signalling massively disrupted 
the balance between maintenance and differentiation of this cell population.
This shows that NOTCH signalling has a complex effect
on myogenesis, acting as a potent stimulator of the myogenic
program for DML cells, but only during a limited time window.

In search of a signal controlling the mosaic activation of NOTCH
we observed in the DML, we noted that neural crest cells that migrate in
close proximity to the DML express the NOTCH ligand Delta1 (DLL1)
in a salt-and-pepper pattern (Fig. 4a and Supplementary Fig. 8a, b). A 
provocative hypothesis was thus that migrating DLL1-expressing 
neural crest cells may activate NOTCH signalling in selected progeni- 
tors within the DML. We eliminated the neural crest cell population by 
electroporating into the neural tube a diphtheria toxin fragment A 
complementary DNA (DTA) under the control of a neural-crest- 
specific promoter (Supplementary Fig. 7a-f). This led to a considerable 
decrease in the expression of MYF5 on the electroporated side (Fig. 4b, 
arrowheads, and Supplementary Fig. 8c; n = 13/15). The inhibition of
non-canonical, planar cell polarity (PCP) WNT signalling in Xenopus
affects neural crest migration (ref. 22) without affecting its induction. In the
dorsal neural tube, we electroporated a mutant form of the WNT
intracellular effector Dishevelled that specifically inhibits the WNT/PCP
pathway (refs 23, 24). This led to a considerable reduction in MYF5
expression compared to the control side (Fig. 4c and Supplementary

Figure 3 | Myogenesis requires the transient activation of NOTCH.
a-f, Confocal stacks of somites in dorsal (b, d, f) and transverse (a, c, e) view
24 h after electroporation of NICD (in green), and stained (in red) for PAX7
(a, b), MYF5 (c, d) and MyHC (e, f). g, Quantification of a-f. Error bars show
s.d. ***P < 0.0001. h-o, Time-course analysis of MYF5 expression (in green)
in dorsal (i, k, m, o) and transverse (h, j, l, n) view, 6 h (h-k) and 12 h (l-o) after
electroporation of CAGGS-H2B-RFP as control (h, i, l, m) or HA-NICD
(j, k, n, o). In red, staining for RFP (h, i, l, m) or HA (j, k, n, o). p-u, Confocal
stacks of somites in dorsal (q, s, u) and transverse (p, r, t) view electroporated
with a doxycyclin-expression-inducible system. Embryos were electroporated
with an empty vector as controls (p, q) or HA-NICD (r-u), treated for 1 h
(t, u) or overnight (p-s) with doxycyclin and stained for GFP (in green) and
MYF5 (in red). v, Quantification of p-u. Scale bars, 50 µm.


Fig. 8d; n = 10/13). We then electroporated DLL1 under the control of 
the neural-crest-specific promoter in the neural tube and verified that 
this resulted in the overexpression of DLL1 protein in neural crest cells
(Supplementary Fig. 9a-c). We observed a significant increase of
MYF5 expression (Fig. 4d and Supplementary Fig. 8e; n = 9/13).
Loss of DLL1 function was achieved by electroporating a dominant-negative
form of DLL1 (DN DLL1; ref. 26) or siRNAs against chick
DLL1. DN DLL1 protein was expressed in neural crest cells 
(Supplementary Fig. 9k-m) and the siRNA construct efficiently 
reduced the endogenous DLL1 mRNA (Supplementary Fig. 9n-p) 
and protein (Supplementary Fig. 9q-s) levels. Both the DN DLL1 
(n = 17/17) and the DLL1 siRNA (n = 10/11) resulted in a significant 
reduction in MYF5 staining (Fig. 4e, f and Supplementary Fig. 8f, g). 
Overexpression of DLL1 in neural crest resulted in a robust activation
of chick HES1 mRNA expression (Supplementary Fig. 9d-f) and of the
NOTCH reporter activity in somites (67%; Supplementary Fig. 9i, j),
whereas electroporation of DTA or DN DLL1 led to a near loss of
NOTCH reporter activity (1.6% and 1.8%, respectively, Supplemen- 
tary Fig. 9g, h, j; controls: 11%, Fig. 1c, d; Supplementary Fig. 21), 
strongly supporting the hypothesis that NOTCH ligands presented 


Figure 4 | Neural crest regulates myogenesis in somites through NOTCH 
signalling. a, Mosaic expression of chick DLL1 (in blue, yellow arrowheads) 
within the HNK1-positive (in red) neural crest population. b-f, Confocal stacks 
of neural tube, neural crest and somites in dorsal view, 24h after 
electroporation of one half of the neural tube with U2-DTA (b), CAGGS-DVLΔDEP
(c), U3-DLL1 (d), U2-DN DLL1 (e) (see Methods for details) and
siRNA against chick DLL1 (f). In green (f) native RFP; in green (b-e) GFP
immunostaining; in red, MYF5 and in blue, HNK1. Dotted lines in b-f indicate
the level of transverse sections shown in Supplementary Fig. 8c-g. Scale bars,
50 µm.

by neural crest cells modulate NOTCH signalling in somites. When
DTA, DLL1 or DN DLL1 was electroporated into neural crest, MYOD
expression was affected similarly to MYF5 (Supplementary Fig. 10), 
indicating that the two major molecular players of the early myogenic 
program are similarly regulated by DLL1 from neural crest. The pro- 
liferation of progenitors within the DML was not significantly changed 
in these experiments (Supplementary Fig. 11a-l), further supporting 
the hypothesis (Fig. 3t, u) that NOTCH signalling regulates the pro- 
gressive differentiation of the muscle progenitor pool within the DML. 
Because neural crest emigrates from the neural tube during a limited 
time period (about 24h from when migration initiates), the neural- 
crest-mediated regulation of muscle growth is limited to the initial 
phases of myotome formation. However, this may have long-term 
consequences on muscle growth, as we observed significant changes 
in myotome growth 48 h after electroporation of DTA, DLL1 or DN
DLL1 into the neural crest, that is, 24 h after crest migration has ceased
(Supplementary Fig. 12a-s). As controls, we verified that the neural 
crest manipulations did not affect the expression of the known modu- 
lators of myotome formation in the dorsal neural tube, that is, WNT1, 
WNT3A and BMP4 (Supplementary Fig. 13a-l). It is unclear whether 
the same regulatory mechanisms are used in mouse. Hypomorphic Dll1
mouse mutants displayed an enlarged primary myotome (ref. 20). However,
as Dll1 is expressed in both paraxial mesoderm and neural crest in the early
mouse embryo, the source of Notch signalling that engenders this
phenotype remains to be defined. Answering this question will require
inhibiting Dll1 activity in specific cell types of the mouse embryo.

Our model suggests an additional role of the NOTCH pathway 
during myogenesis whereby, within a population of DML cells all 
exposed to uniform gradients of myogenic activating factors, only those 
cells that transiently activate the NOTCH pathway undergo myogenesis. 
Transient NOTCH signalling is triggered by the NOTCH ligand DLL1 
carried and presented by migrating neural crest cells in a ‘kiss and run’ 
mode of signal transduction (Supplementary Movie 2). This links
the timing of myotome formation to that of neural crest migration, 
providing a mechanistic link for the concurrence of these two events 
(Supplementary Fig. 14a-g). The ability of migrating cells to influence 
cell fate in neighbouring tissues may reveal a general principle for 
generating pulses of signal activation that result in the differentiation 
of a defined subset of cells within a stem or progenitor pool. 


METHODS SUMMARY 


Electroporation, vectors, time-lapse experiments and confocal analyses. 
Further details can be found in Methods. The somite electroporation technique
has been described elsewhere (refs 6, 25). Time-lapse experiments were performed
essentially as described (ref. 27) on transverse slices of embryos.

Quantifications and statistical analyses. On average, more than 2,300 cells were 
counted per point to compute the corresponding quantifications shown in Figs 1-3 
and Supplementary Figs 2-6. Statistical analyses were performed using the 
GraphPad Prism software. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 


Received 6 September 2010; accepted 24 February 2011. 
Published online 15 May 2011. 


1. Shimojo, H., Ohtsuka, T. & Kageyama, R. Oscillations in Notch signaling regulate 
maintenance of neural progenitors. Neuron 58, 52-64 (2008). 

2. Joubin, K. & Stern, C. D. Molecular interactions continuously define the organizer
during the cell movements of gastrulation. Cell 98, 559-571 (1999).

3. Palmeirim, I., Henrique, D., Ish-Horowicz, D. & Pourquie, O. Avian hairy gene
expression identifies a molecular clock linked to vertebrate segmentation and
somitogenesis. Cell 91, 639-648 (1997).




4. Denetclaw, W. F. Jr, Berdougo, E., Venters, S. J. & Ordahl, C. P. Morphogenetic cell 
movements in the middle region of the dermomyotome dorsomedial lip 
associated with patterning and growth of the primary epaxial myotome. 
Development 128, 1745-1755 (2001). 

5. Venters, S. J. & Ordahl, C. P. Persistent myogenic capacity of the dermomyotome 
dorsomedial lip and restriction of myogenic competence. Development 129, 
3873-3885 (2002). 

6. Gros, J., Scaal, M. & Marcelle, C. A two-step mechanism for myotome formation in 
chick. Dev. Cell 6, 875-882 (2004). 

7. Kahane, N., Cinnamon, Y. & Kalcheim, C. The cellular mechanism by which the 
dermomyotome contributes to the second wave of myotome development. 
Development 125, 4259-4271 (1998). 

8. Kahane, N., Cinnamon, Y. & Kalcheim, C. The roles of cell migration and myofiber 
intercalation in patterning formation of the postmitotic myotome. Development 
129, 2675-2687 (2002). 

9. Cinnamon, Y., Kahane, N. & Kalcheim, C. Characterization of the early development 
of specific hypaxial muscles from the ventrolateral myotome. Development 126, 
4305-4315 (1999). 

10. Kahane, N., Cinnamon, Y., Bachelet, I. & Kalcheim, C. The third wave of myotome
colonization by mitotically competent progenitors: regulating the balance
between differentiation and proliferation during muscle development.
Development 128, 2187-2198 (2001).

11. Ben-Yair, R. & Kalcheim, C. Lineage analysis of the avian dermomyotome sheet
reveals the existence of single cells with both dermal and muscle progenitor fates.
Development 132, 689-701 (2005).

12. Gros, J., Manceau, M., Thome, V. & Marcelle, C. A common somitic origin for
embryonic muscle progenitors and satellite cells. Nature 435, 954-958 (2005).

13. Relaix, F., Rocancourt, D., Mansouri, A. & Buckingham, M. A Pax3/Pax7-dependent
population of skeletal muscle progenitor cells. Nature 435, 948-953 (2005).

14. Kassar-Duchossoy, L. et al. Pax3/Pax7 mark a novel population of primitive
myogenic cells during development. Genes Dev. 19, 1426-1431 (2005).

15. Hirsinger, E. et al. Notch signalling acts in postmitotic avian myogenic cells to
control MyoD activation. Development 128, 107-116 (2001).

16. Fryer, C. J., Lamar, E., Turbachova, I., Kintner, C. & Jones, K. A. Mastermind mediates
chromatin-specific transcription and turnover of the Notch enhancer complex.
Genes Dev. 16, 1397-1411 (2002).

17. Weng, A. P. et al. Growth suppression of pre-T acute lymphoblastic leukemia cells
by inhibition of notch signaling. Mol. Cell. Biol. 23, 655-664 (2003).

18. Das, R. M. et al. A robust system for RNA interference in the chicken using a
modified microRNA operon. Dev. Biol. 294, 554-563 (2006).

19. Vasyutina, E., Lenhard, D. C. & Birchmeier, C. Notch function in myogenesis. Cell
Cycle 6, 1450-1453 (2007).

20. Schuster-Gossler, K., Cordes, R. & Gossler, A. Premature myogenic differentiation
and depletion of progenitor cells cause severe muscle hypotrophy in Delta1
mutants. Proc. Natl Acad. Sci. USA 104, 537-542 (2007).

21. Vasyutina, E. et al. RBP-J (Rbpsuh) is essential to maintain muscle progenitor cells
and to generate satellite cells. Proc. Natl Acad. Sci. USA 104, 4443-4448 (2007).

22. De Calisto, J., Araya, C., Marchant, L., Riaz, C. F. & Mayor, R. Essential role of non-
canonical Wnt signalling in neural crest migration. Development 132, 2587-2597
(2005).

23. Wallingford, J. B. et al. Dishevelled controls cell polarity during Xenopus
gastrulation. Nature 405, 81-85 (2000).

24. Rothbacher, U. et al. Dishevelled phosphorylation, subcellular localization and
multimerization regulate its role in early embryogenesis. EMBO J. 19, 1010-1022
(2000).

25. Gros, J., Serralbo, O. & Marcelle, C. WNT11 acts as a directional cue to organize the
elongation of early muscle fibres. Nature 457, 589-593 (2009).

26. Henrique, D. et al. Expression of a Delta homologue in prospective neurons in the
chick. Nature 375, 787-790 (1995).

27. Rios, A. C., Denans, N. & Marcelle, C. Real-time observation of Wnt β-catenin
signaling in the chick embryo. Dev. Dyn. 239, 346-353 (2010).


Supplementary Information is linked to the online version of the paper at 
www.nature.com/nature. 


Acknowledgements We thank N. Rosenthal and P. Currie for critical reading of the 
manuscript. This study was funded by grants from the Agence Nationale de la
Recherche (ANR), and by the EU 6th Framework Programme Network of Excellence
MYORES. The help of P. Weber, S. Firth, C. Johnson and I. Harper from Imaging Facilities
(IBDML, Marseille and MMI, Monash University) is acknowledged. 


Author Contributions A.C.R. and C.M. conceived the experiments. A.C.R. predominantly
performed the work with the help of O.S. D.S. designed the animation. C.M. supervised 
the project and wrote the paper. 


Author Information Reprints and permissions information is available at 
www.nature.com/reprints. The authors declare no competing financial interests. 
Readers are welcome to comment on the online version of this article at 
www.nature.com/nature. Correspondence and requests for materials should be 
addressed to C.M. (christophe.marcelle@monash.edu). 


©2011 Macmillan Publishers Limited. All rights reserved 


METHODS 


Electroporation and confocal analysis. The somite electroporation technique 
that was used throughout this study has been described elsewhere*”’. Briefly, we 
targeted the expression of various constructs to the dorsomedial portion of newly 
formed interlimb somites of Hamburger-Hamilton (HH) stage 15-16 chick 
embryos (24-28 somite)**. We have previously shown that this technique allows 
the specific expression of cDNA constructs in the DML of the dermomyotome’®. 
To target the neural crest population, we electroporated the dorsal neural tube of 
HH stage 13-14 chick embryos at the level of the presomitic mesoderm. 

The following constructs have been previously published: HES1-d2EGFP and 
the HES1-EGFP” (provided by R. Kageyama) contain the mouse Hes1 promoter
region upstream of destabilized or stable GFP. The CAGGS-H2B-RFP (provided 
by S. Tajbakhsh) contains a fusion of histone 2B with RFP downstream of the 
CAGGS strong ubiquitous promoter (CMV/chick β-actin promoter/enhancer).
The CAGGS-EGFP” contains the CAGGS promoter followed by the EGFP 
reporter gene. The pCAB-HA-NICD-IRES-GFP (provided by N. Daudet) contains 
an HA-tagged NICD under the control of the CAGGS promoter”". The doxycyclin 
inducible system is composed of two plasmids that are co-electroporated: first, the 
pCIRX-rtTA-IRES-DsRed* (provided by O. Pourquié) contains a Tet-On 
Advanced transactivator (rtTA, Clontech) downstream of the CAGGS promoter.
The IRES-DsRed-Express allows the detection of electroporated cells. Second, the 
pBI-HANICD-EGFP is the response plasmid (Clontech) in which the HA-tagged
constitutively active form of NOTCH, NICD, was cloned. The bidirectional 
tetracyclin-response element drives the expressions of EGFP (which serves as an 
internal control of the induced response, see Supplementary Fig. 2a, b) and 
HANICD. pCLGFP-DVLΔDEP contains a mutated form of Xenopus
Dishevelled that lacks the DEP domain, driven by the CAGGS promoter”. This 
construct also contains EGFP driven by its own SV40 promoter. The siRNA
directed against chick NOTCH1 has been described elsewhere (ref. 18).

We made new constructs for this study: to construct the HES1 nVENUS-PEST,
a destabilized nuclear Venus GFP variant*’ was inserted downstream of the mouse 
Hes1 promoter region’. The CAGGS-DN MAML1-EGFP contains a truncated,
dominant-negative form of the human Mastermind (DN MAML1), fused with
EGFP” downstream of the CAGGS promoter. The pCAB-HA-NICD was constructed
by removing the EGFP reporter from pCAB-HA-NICD-IRES-GFP. The
U2- and U3-EGFP were made by inserting the U2 and U3 evolutionary conserved 
Sox10 enhancer sequences™ in the TK-EGFP”? plasmid, that contains the thymi- 
dine kinase minimal promoter upstream of the EGFP. The diphtheria toxin gene”®, 
the chick DLL1 or a dominant-negative form of this gene*’ were inserted in the U2 
or the U3-TK-EGFP in place of the EGFP to obtain the U2-DTA, the U2-DN DLL1
and the U3-DLL1 electroporation vectors. To detect electroporated
cells, those plasmids were electroporated with a pCAGGS-EGFP. We have con- 
structed two RNA interference plasmids as described previously’* that each 
express two siRNAs directed against chick DLL1. Sequences TCACAGCGATA 
ACTCCGATAAA and TGCAGGAGTTTGTCAACAAGAA were inserted in 
siRNA chick DLL1 A, whereas sequences GATTCAGTATATTCCACTTCAA 
and CCGGCACCTTCTCGCTCATCAT were inserted in siRNA chick DLL1 B. 
Electroporation of plasmids A, B, or A together with B efficiently decreased the 
endogenous expression of chick DLL1 mRNA and protein, whereas the electro- 
poration of siRNA directed against luciferase had no effect on chick DLL1 expres- 
sion. An RFP reporter gene is inserted in the same constructs to detect 
electroporated cells. 

Antibody stainings and BrdU labelling. For BrdU labelling, embryos were incubated
for 30 min with 50 µl of a 1 mg ml⁻¹ BrdU (Sigma) solution. Whole-mount
antibody stainings were performed as described”. The following antibodies were 
used: rabbit polyclonals directed against chick myogenic regulatory factors MYF5 
and MYOD*; chick DLL1°’; and anti-RFP (Abcam); chicken polyclonals against 
EGFP (Abcam); rat polyclonals against the HA tag and anti-BrdU (Abcam). We 
also used monoclonals against the dermomyotome and dorsal neural tube marker 
PAX7 and against terminal myogenic differentiation marker MyHC (MF20) 
(obtained from the Developmental Studies Hybridoma Bank); and the neural- 
crest-specific monoclonal antibody HNK1 (provided by A. Eichmann). 

In situ hybridization. The following probes were used: chick HES1/cHairy2 
(ref. 40) and chick*” DLL1 and chick LFNG (provided by O. Pourquié), and 
400 bp cDNA clones coding for fragments of chick WNT1, WNT3A and a 1kb 
chick BMP4 probe". 

Doxycyclin-mediated induction of NOTCH signalling. Eight hours after electroporation
of pCIRX-rtTA-IRES-DsRed and pBI-HANICD-EGFP, doxycyclin
(300 µl of a 0.1 µg ml⁻¹ solution) was added onto the embryos, and it was either
washed off after one hour with PBS for transitory upregulation of NICD, or left
overnight, for sustained expression of this molecule. We verified that the response
plasmid is completely silent before doxycyclin addition (that is, no EGFP
expression, Supplementary Fig. 6a) while it is strongly and rapidly activated 6 h
after doxycyclin addition (Supplementary Fig. 6b).

Time-lapse experiments and confocal analyses. Time-lapse experiments were 
performed essentially as described (ref. 27) on transverse slices (250 µm) of embryos.
Embryo slices were filmed for 11 h at 37 °C with a confocal inverted Leica SP5
microscope equipped with a resonant scanner, at the rate of one image stack per
ten minutes. Confocal images were acquired transversally over a thickness of
100 µm; Supplementary Movie 1 corresponds to a fraction (10 µm thick) of the
acquired images. Dorsal views shown in Figs 1-4 are projections of stacks of 
confocal images. Confocal stacks of images were visualized and analysed with 
the Imaris software suite. Cell countings were performed using the Improvision 
Volocity software suite. 

Quantifications and statistical analyses. Electroporation results in the transfec- 
tion of a portion of the targeted cell population, which is variable from embryo to 
embryo. To precisely evaluate the phenotypes obtained after electroporation of 
cell-autonomously acting cDNA constructs, the number of positive cells was 
compared to the total number of electroporated cells, recognized by an internal 
fluorescent reporter construct. On average, more than 2,300 cells were counted per 
point and the corresponding quantifications are shown in Figs 1-3 and 
Supplementary Figs 2-6. This mode of quantification could not be applied when 
constructs were electroporated in one tissue while the effects were evaluated in 
another, such as in experiments shown in Fig. 4 and Supplementary Figs 8-11. In 
this case, we report the number of embryos in which we observed a phenotype 
similar to the one that is illustrated in the figures, over the total number of 
electroporated embryos. Statistical analyses were performed using the GraphPad 
Prism software. Mann-Whitney non-parametric two-tail testing was applied to 
populations to determine the P values indicated in the figures. In each graph, 
columns correspond to the mean and error bars correspond to the standard 
deviation. ***P value < 0.0001. 
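
The ratio-based quantification and the two-tailed Mann-Whitney test described above can be sketched in a few lines. This is not the GraphPad Prism procedure itself: the function names are hypothetical, the per-embryo counts are invented for illustration, and the exact enumeration assumes small samples with no tied values.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_positive(positive, electroporated):
    """Percentage of electroporated (reporter-labelled) cells expressing a marker."""
    if electroporated <= 0:
        raise ValueError("no electroporated cells counted")
    return 100.0 * positive / electroporated


def mann_whitney_exact(x, y):
    """Exact two-tailed Mann-Whitney test for small samples without ties.

    Returns (U, p), where U is the smaller of the two U statistics.
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(x + y)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes no tied values")
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    u_max = n1 * n2

    def smaller_u(rank_sum):
        u1 = rank_sum - n1 * (n1 + 1) / 2
        return min(u1, u_max - u1)

    observed = smaller_u(sum(rank[v] for v in x))
    # Null distribution: every way of giving n1 of the ranks to the first group.
    null = [smaller_u(sum(c)) for c in combinations(range(1, n1 + n2 + 1), n1)]
    p = sum(u <= observed for u in null) / len(null)
    return observed, p


# Hypothetical (marker-positive, electroporated) counts, one pair per embryo.
control_counts = [(49, 100), (52, 100), (46, 100), (55, 100), (44, 100)]
treated_counts = [(7, 100), (5, 100), (9, 100), (6, 100), (8, 100)]

control = [percent_positive(pos, total) for pos, total in control_counts]
treated = [percent_positive(pos, total) for pos, total in treated_counts]
u, p = mann_whitney_exact(control, treated)

print(f"control: {mean(control):.1f} +/- {stdev(control):.1f} % (mean, s.d.)")
print(f"treated: {mean(treated):.1f} +/- {stdev(treated):.1f} % (mean, s.d.)")
print(f"U = {u}, two-tailed exact P = {p:.4f}")
```

Normalizing to the number of electroporated cells is what makes embryos with different transfection efficiencies comparable. For the tissue-to-tissue experiments, where this normalization is not possible, the text instead reports the number of embryos showing the phenotype over the total electroporated.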


Non-adaptive origins of interactome complexity


Ariel Fernandez & Michael Lynch


The boundaries between prokaryotes, unicellular eukaryotes and 
multicellular eukaryotes are accompanied by orders-of-magnitude 
reductions in effective population size, with concurrent amplifica- 
tions of the effects of random genetic drift and mutation’. The 
resultant decline in the efficiency of selection seems to be sufficient 
to influence a wide range of attributes at the genomic level in a non- 
adaptive manner’. A key remaining question concerns the extent to 
which variation in the power of random genetic drift is capable of 
influencing phylogenetic diversity at the subcellular and cellular 
levels’ *. Should this be the case, population size would have to be 
considered as a potential determinant of the mechanistic pathways 
underlying long-term phenotypic evolution. Here we demonstrate 
a phylogenetically broad inverse relation between the power of 
drift and the structural integrity of protein subunits. This leads 
to the hypothesis that the accumulation of mildly deleterious muta- 
tions in populations of small size induces secondary selection for 
protein-protein interactions that stabilize key gene functions. By 
this means, the complex protein architectures and interactions 
essential to the genesis of phenotypic diversity may initially emerge 
by non-adaptive mechanisms. 

Here we examine whether established gene orthologies reveal a role 
for drift in phylogenetic patterns of protein structural evolution. 
Although evolutionary change at the structural level is unlikely to 
destabilize greatly the native fold of an essential protein, as the com- 
plete loss of function would generally be unbearable, the drift hypo- 
thesis predicts a negative relation between population size (N) and the 
accumulation of mildly deleterious amino-acid substitutions. The fol- 
lowing examination of the structures of orthologous proteins from 
vastly different lineages suggests that the enhanced power of drift in 
eukaryotes (multicellular species in particular) results in a qualitative 
reduction in the stability of protein-water interfaces (PWIs) through
the partial exposure of paired backbone polar groups (amides and 
carbonyls) that are otherwise protected in prokaryotes. In effect, the 
reduced efficiency of selection in small-N species encourages the accu- 
mulation of mild structural deficiencies in the form of solvent-accessible 
backbone hydrogen bonds (SABHBs), which lead to protein structures 
that are more ‘open’ and vulnerable to fold-disruptive hydration (Fig. 1a) 
and create protein-water interfacial tension (PWIT; Supplementary
Fig. 1)° by hindering the hydrogen-bonding capabilities of nearby 
water molecules. 

We argue that the emergence of unfavourable PWIs promotes the 
secondary recruitment of novel protein-protein associations that 
restore structural stability by reducing PWI. Under this hypothesis, 
complex organisms may frequently develop protein-protein interac- 
tions not as immediate vehicles for novel adaptive functions, but as 
compensatory mechanisms for retaining key gene functions. Once in 
place, such physical contact between interacting proteins may provide 
a selective environment for the further emergence of entirely novel 
protein-protein interactions underlying cellular and organismal com- 
plexities. Our suggestion that the hallmark of eukaryotic evolution, the 
origin of interactome complexity, may have arisen in part as a passive 
consequence of the enhanced power of drift reduces the need to invoke 
direct long-term selective advantages of phenotypic complexity’. 


To gain insight into the evolution of interactome complexity, we 
derived quantitative measures of the PWIT as indicators of potential 
molecular interactivity. To estimate the PWIT of a protein, we com- 
putationally equilibrated the protein structure in surrounding water, 
using the function g(r) to represent the time-averaged coordination 
(number of hydrogen bonds) associated with a water molecule at 
position r (Fig. 1a), and integrating over the entire protein surface 
all water molecules within a 10 Å radius (the thickness of four layers
of water molecules). Compared with bulk water (where g = 4), interfacial
water molecules may have reduced hydrogen-bonding opportunities
(g < 4) and often counterbalance these losses by interacting
with polar groups on the protein surface. Thus the PWIT parameter
integrates information on unfavourable local decreases in g and
favourable polarization contributions from the protein to yield the
free-energy cost, ΔGi, of spanning the protein-water interface
(Methods). A high PWIT signals a high propensity for protein-protein 
associations, which reduce the PWI area. 
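
As a rough illustration of the bookkeeping behind this measure, the sketch below selects water molecules within 10 Å of the protein and sums their coordination deficit (4 - g). It is not the PWIT itself: the polarization term and the conversion to a free energy are defined in Methods and are omitted here, and the data layout and function names are assumptions made for this example.

```python
import math

BULK_COORDINATION = 4.0  # g for bulk water, as stated in the text
SHELL_RADIUS = 10.0      # angstroms; the interfacial shell used in the text


def interfacial_waters(waters, protein_atoms, radius=SHELL_RADIUS):
    """Keep water records lying within `radius` of at least one protein atom.

    Each water is a dict with a position "pos" (x, y, z) and a time-averaged
    coordination "g"; this layout is hypothetical.
    """
    return [
        w for w in waters
        if any(math.dist(w["pos"], atom) <= radius for atom in protein_atoms)
    ]


def coordination_deficit(waters, protein_atoms, radius=SHELL_RADIUS):
    """Sum of (4 - g) over interfacial waters; larger means more frustrated water."""
    shell = interfacial_waters(waters, protein_atoms, radius)
    return sum(BULK_COORDINATION - w["g"] for w in shell)


# Toy system: one protein atom at the origin and three water molecules.
protein = [(0.0, 0.0, 0.0)]
waters = [
    {"pos": (3.0, 0.0, 0.0), "g": 2.5},   # close to the surface, poorly coordinated
    {"pos": (0.0, 8.0, 0.0), "g": 3.5},   # inside the shell, mildly perturbed
    {"pos": (20.0, 0.0, 0.0), "g": 4.0},  # bulk water, outside the shell
]

print(coordination_deficit(waters, protein))
```

In a real calculation the g values would come from an equilibrated simulation of the solvated structure, and the deficit would be only one of the two contributions that the PWIT combines.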

To validate the use of PWIT as a measure of interactivity, we 
examined an exhaustive catalogue of contact topologies for protein 
complexes with one to six subunits, with each topology being evaluated 
with one or more non-homologous complexes using structures in the 
Protein Data Bank (PDB) (Supplementary Table 1). For each complex, 
we computed the total protein-protein interface area after identifying 
the residues engaged in intermolecular contacts’. For each protein 
subunit, the protein-protein interface is contained within the PWI 
region that generates tension in the free subunit, and there is a tight 
correlation between the surface areas for both regions, implying that 
regions on the protein surface generating PWIT (i.e. those with g < 4 
for nearby water) actually promote associations (Supplementary Figs 1 
and 2a). Next, we verified that protein surface regions generating 
PWIT coincide with the affinity-contributing regions at protein- 
protein interfaces. To this end, we tested the value of PWIT as a 
promoter of protein associations by focusing on the interface for the 
1:1 human growth hormone (hGH)-receptor complex*® (Supplemen- 
tary Fig. 2b) for which the consequences of amino-acid substitutions 
have been extensively evaluated. Our analysis reveals a strong correla- 
tion between the change in PWIT induced by site-specific mutagenesis 
of interfacial residues and the association free-energy difference 
created by the alteration of the hormone-receptor interface (Sup- 
plementary Fig. 2c). 

Comparison of orthologous proteins engaging in different levels of 
homo-oligomerization in different species’ further supports the view 
that PWIT serves as a measure of the propensity for protein-protein 
association. The ratio of protein-protein interface areas (lower to 
higher degrees of complexation; Supplementary Table 2) exhibits a 
strong positive correlation with the ratio of PWITs for the respective 
free subunits (Fig. 1b). As complexes with higher degrees of oligo- 
merization arise from lower-order complexes, this implies that the 
degree of cooperativity among subunits correlates with the PWIT of 
the basic subunit. 

Hydrophobic regions on protein surfaces obviously contribute to 
PWIT, but analysis of proteins exhibiting association propensity 
(Supplementary Table 2) shows that the regions generating 73 ± 5%
of the PWIT (Supplementary Information and Supplementary Fig. 1)
arise from SABHBs. The resultant hydration of backbone polar groups 
(amides and carbonyls) causes a loss of coordination for local water 
molecules, which increases surface tension and creates an unstable 
PWI, as the cavities cannot accommodate a bulk-like water molecule’. 
As an example of how such a structural deficiency can be alleviated 
through a protein association, an isolated 4-subunit of the human 
haemoglobin tetramer has seven SABHBs that become protected 
within the tetrameric complex, such that the ratio (v) of SABHBs to 
total BHBs in the complex-associated subunit is the same as that in the 
natively monomeric unit for haemoglobin from the trematode Fasciola 
hepatica (Fig. 1c). 

To evaluate whether the accumulation of structural deficiencies of 
proteins is generally encouraged by random genetic drift, and in turn 
enhances the propensity for establishing protein complexes, we examined 
a set of 106 orthologous water-soluble proteins (sequence identity greater 
than 30%)'*”” with PDB-reported structures for at least two species. We 
considered 36 species with vastly different population sizes'”, each con- 
taining proteins in at least 90 of the 106 orthologous groups 
Figure 1 | Structural deficiencies in soluble proteins promote protein
associations. a, Hydration of exposed polar backbone induces interfacial 
tension by causing water molecules near the defect to relinquish part of their 
coordination (g < 4) relative to the level in surrounding bulk solvent (g = 4).
White represents hydrogen atoms; red, oxygen; blue, nitrogen; black, carbon; 
the larger purple circles denote side chains for amino acids. Hydrogen bonds 
are denoted by dashed lines. Thick grey lines outline the external surface of the 
overall protein molecule, and the underlying structure represents two amino 
acids made adjacent by the protein architecture and bound by a hydrogen bond 
between the backbone amide (blue:white) of one amino acid and carbonyl 
(red:black) of the other. Water molecules are shown as angular red and white 
segments, with the coordination number g denoting the number of hydrogen 
bonds associated with a water molecule (g = 4 for bulk water; g < 4 for
confined interfacial water). In the centre, the structure of the protein causes 
local exposure and unfavourable hydration of the polar backbone, whereas the 
absence of such local interactions between water molecules and the well- 
wrapped proteins on the left and right reduces interfacial tension (interfacial 
water is bulk-like, retaining the maximum coordination g = 4). b, Comparison 
of orthologous proteins with different levels of homo-oligomerization reveals 
that the PWIT is an indicator of the propensity for cooperative improvement/ 
refinement of protein function through complexation. The ratio of protein- 
protein interfaces (small to large) was determined for pairs of orthologous 
proteins with different levels of oligomerization in different species 
(Supplementary Table 2) and plotted against the ratio of PWITs for the 
respective free subunits. The tight correlation (r² = 0.94) reveals that
interspecific differences in PWIT accompany differences in levels of 
oligomerization, thus providing a measure of potential allosteric or cooperative 
improvement of basic protein function. Complexes with cyclic rotational 
symmetry (C2, C3, ...) can further oligomerize into complexes with dihedral 
(D2, D3, ...) symmetry, as shown in the idealized diagrams in the lower right. 
For example, C2 complexes can dimerize into D2 complexes, trimerize into D3 
complexes, etc., whereas a D3 complex can also be obtained by dimerization of 
a C3 complex. For the protein-protein interface and PWIT ratios examined, the 
interface for the subunit in the complex with lower-order symmetry is 
compared with that in the complex with higher-order symmetry, yielding 
analyses based on protein pairs contrasted within three groupings: C2 versus 
D2, C2 versus D3, and C3 versus D3. c, The SABHB patterns from two 
haemoglobins with different oligomerization levels in their native states are 
compared. In the bottom panels, the protein backbone is represented by virtual 
bonds in blue joining α-carbons, with well-protected BHBs shown as light grey
and SABHBs as green lines joining the α-carbons of the paired residues. The
ribbon representations of the human complex and dissociated subunit (chain A 
in PDB.2DN2, left and centre, respectively) are included as aids to the eye, 
representing the structuring of the backbone in each subunit. The free subunit 
isolated from the tetramer in H. sapiens (PDB.2DN2, chain A, centre) has seven 
excess SABHBs (denoted by stars) when compared with the subunit within the 
tetrameric complex, where they are well-protected intermolecularly, alleviating 
interfacial tension. As a consequence of this better wrapping, the overall extent 
of structural deficiency (v value) for the subunit within the human complex is 
identical to that of the natively monomeric haemoglobin from the trematode F. 
hepatica (PDB.2VYW). This raises the possibility that the accumulation of 
structural deficiencies in the mammalian haemoglobin subunit promoted the 
emergence of an oligomeric association as a means of reducing excess 
interfacial tension. The structural displays were obtained by uploading the PDB 
text files into the program YAPview, a displayer of local backbone desolvation 
of soluble proteins that can be downloaded from the link ‘Dehydron Calculator’
at http://www.owlnet.rice.edu/~arifer/. 


(Supplementary Tables 3-5). Template-based three-dimensional struc- 
tures for orthologues lacking PDB-reported structures were constructed 
by homology threading’*”*, and evaluated, ranked and selected accord- 
ing to the energetic proximity between template and model’*. The accu- 
racy of this homology-based prediction of PWIT was determined with a 
test set of proteins with PDB-reported structures from two species, sub- 
jecting one member of each orthologous pair to homology threading 
through the other. Comparison of the indirect and direct estimates of 
PWIT demonstrates that when sequence identities are greater than 35%, 
the predicted PWIT diverges less than 10% from the more direct estimate 
for the same protein (Supplementary Fig. 3). 

For each protein structure, g(r) was obtained as described in Methods, 
and the relative propensities for protein association across orthologues 
were then determined by assessing differences in the free-energy cost 
$\Delta G_{\mathrm{if}}$ among species. We estimated the relative complexation propensity
$M_{j,n}$ of a protein in orthologue group j (1, ..., 106) from species n (1,
..., 36) by adopting Escherichia coli as a reference species (n = 1):

$$M_{j,n} = \frac{(\Delta G_{\mathrm{if}})_{j,n} - (\Delta G_{\mathrm{if}})_{j,1}}{(\Delta G_{\mathrm{if}})_{j,1}} \qquad (1)$$

With this index, $M_{j,1} = 0$ for all proteins in E. coli, and taxa with less
well-wrapped proteins (and hence greater propensity for complexation)
have positive values.
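
Read as a relative difference against the E. coli value for the same orthologue group, equation (1) can be sketched in a few lines of Python. This is only an illustration: the function names, the dictionary layout and the numerical values are assumptions made for the example and are not taken from the study.

```python
def relative_propensity(dg_if, reference_species="E. coli"):
    """Relative complexation propensity for every (group, species) pair.

    dg_if maps (orthologue_group, species) -> free-energy cost of spanning
    the protein-water interface. Each value is normalised against the
    reference species within the same orthologue group.
    """
    out = {}
    for (group, species), value in dg_if.items():
        ref = dg_if[(group, reference_species)]
        out[(group, species)] = (value - ref) / ref
    return out


def species_mean(propensity, species):
    """Mean propensity of one species over all groups in which it appears."""
    values = [v for (_, s), v in propensity.items() if s == species]
    return sum(values) / len(values)


# Hypothetical free-energy costs (arbitrary units), for illustration only.
example = {
    ("sod", "E. coli"): 10.0,
    ("sod", "H. sapiens"): 15.0,
    ("ldh", "E. coli"): 20.0,
    ("ldh", "H. sapiens"): 25.0,
}
m = relative_propensity(example)
```

By construction the reference species always scores zero, and a species whose proteins carry a larger interfacial free-energy cost scores positive.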

The mean value of species-specific estimates of $M_{j,n}$ over all proteins
evaluated is negatively correlated with the approximate effective popu- 
lation sizes of species (Fig. 2a), given that the average ranking of the 
latter is prokaryotes > unicellular eukaryotes > invertebrates > verte- 
brates and land plants’. A specific example of a trend towards increas- 
ing structural openness with reduced population sizes is illustrated in 
Fig. 2b, where the SABHB patterns and v values for orthologues of the 
enzyme superoxide dismutase are compared across three species. 
Figure 2 | Structural degradation enhances PWIT and promotes protein
interactivity in species with small population sizes. a, Potential for 
interactome complexity of 36 species with diverse population sizes 
(Supplementary Table 2), relative to E. coli. To highlight the relative power of 
random genetic drift, bars are colour-coded to reflect groupings of species in 
broad population-size categories. b, Overall structural deficiency of orthologues 
of the enzyme superoxide dismutase (Mn), revealing a progressive 
accumulation of SABHBs in the orthologues of the bacterium E. coli, the 
nematode Caenorhabditis elegans and H. sapiens. The upper ribbon 
representations illustrate the structural conservation across orthologues 
(respective PDB accession numbers 3ot7, 3dc6, 2adq). The conventional colour 
coding is red, blue, magenta and light blue for helix, β-strand, loop and turn,
respectively. c, Average structural deficiency (v value) of protein orthologues 
for intracellular and free-living bacterial species. Species identities, progressing 
from left to right are as follows: α-Proteobacteria: Rickettsia typhi, Orientia
tsutsugamushi, Anaplasma centrale str. Israel, Wolbachia sp. wRi, 
Rhodospirillum centenum SW, Magnetospirillum magneticum, Silicibacter 
TM1040, Erythrobacter litoralis; γ-Proteobacteria: Buchnera aphidicola,
Wigglesworthia brevipalpis, Candidatus Blochmannia pennsylvanicus, 
Marinomonas MWYLI, E. coli, Pseudomonas aeruginosa. Only proteins with 
orthologues across the full set of species within each group were considered for 
analysis (Supplementary Tables 6 and 7). 


The results from Fig. 2a and an additional analysis (Supplementary 
Fig. 4) support the hypothesis that large organisms with small popu- 
lation sizes experience a significant enough increase in the power of 
random genetic drift to magnify the accumulation of mild structural 
deficiencies in the form of SABHBs, resulting on average in proteins 
with a more solvent-exposed or ‘open’ structure. By contrast, muta- 
tions to SABHBs are more frequently excluded by selection in species 
with larger population sizes (for example, prokaryotes). Thus, because 
SABHBs are the main determinants of interfacial tension (Supplemen- 
tary Fig. 1), the proteins of large organisms have a greater inherent 
tendency to form novel protein-protein associations (Fig. 1a). This
suggests that increases in protein-network complexity in multicellular 
species may in part owe their origins to modifications to the intracel- 
lular selective environment induced by non-adaptive structural degra- 
dation of individual proteins. 

One concern with the preceding interpretation is the order of 
events: does an initial degradation of architectural integrity of indi- 
vidual proteins in response to random genetic drift induce secondary 
selection for the recruitment of interacting partners, or does the emer- 
gence of cellular complexity (and increased protein interactivity) pre- 
cede secondary changes in protein sequence to accommodate such 
interactions? One way to evaluate this matter is to compare proteins 
from related species that have experienced relatively recent diver- 
gences in effective population sizes but no major modifications in 
intracellular complexity or emergence of multicellularity. 

To achieve this task, we compared orthologous genes from endo- 
symbiotic/intracellular bacteria and their free-living relatives, as the 
former are thought to have experienced substantial reductions in effec- 
tive population sizes’®. Previous suggestions that intracellular bacteria 
experience elevated levels of random genetic drift have been based on 
ratios of substitution rates at silent and replacement sites, which can be 
biased indicators of the efficiency of selection if there is selection on 
silent sites. Although the lack of protein structural information for 
endosymbiotic species requires a sequence-based identification of 
SABHBs derived from reliable scores of native disorder propensity 
(Methods), the resultant analyses are broadly consistent with the 
hypothesis that an increase in the power of drift in microbes 
encourages the accumulation of structural defects in protein architec- 
ture (Fig. 2c). Free-living species, with larger effective population sizes, 
have consistently smaller v values for orthologous genes in both α- and
γ-Proteobacteria. (Application of the same sort of analysis of disorder
propensity across a set of 105 species and 541 proteins corroborates 
this result (Supplementary Figs 5-7).) 

Taken together, our analyses support the hypothesis that the range 
of population sizes experienced by natural populations is sufficient to 
induce significantly different patterns of evolution at the level of protein
architecture. The resultant changes in the intracellular environment 
in small-N species provide an opportunity for the recruitment of
stabilizing protein-protein interactions, yielding a plausible mech- 
anism for the emergence of molecular complexities before their 
exploitation in phenotypic divergence”’’. This hypothesis does not 
deny a potentially significant role for natural selection in using such 
novelties subsequent to their establishment, nor does it deny the fact 
that intramolecular compensatory mutations can alleviate some 
structural defects associated with SABHBs. However, our results do 
raise questions about the necessity of invoking an intrinsic advantage 
to organismal complexity, and provide a strong rationale for expand- 
ing comparative studies in molecular evolution beyond linear 
sequence analysis to evaluations of molecular structure. 


METHODS SUMMARY 


We determined the propensity of proteins to be engaged in associations that 
reduce the PWI by computing the PWIT. This thermodynamic parameter gives
$\Delta G_{\mathrm{if}}$, the free-energy cost of spanning the PWI. The PWIT is computed as

$$\Delta G_{\mathrm{if}} = \frac{1}{2}\int \left\{ a\,|\nabla g|^{2} - |P[g(r)]|^{2} \right\} dr, \qquad (2)$$

where the term $\frac{1}{2}a|\nabla g|^{2}$, with $a = 9.02$ mJ m$^{-1}$ at T = 298 K (Methods), accounts
for tension-generating reductions in water coordination, and the polarization
P[g(r)] accounts for dipole-electrostatic field interactions (Methods). For a given
protein structure or template-based structural model, the field g = g(r) used in 
the numerical integration of equation (2) was determined by equilibrating the 
water-embedded structure within an isothermal-isobaric (NPT) ensemble (with 
fixed parameters N = number of particles, P= pressure and T= temperature; 
Methods)'*'*!°. From structural coordinates, we determined the structural 
deficiencies (SABHBs)”” that generate 73 ± 5% of the PWIT (Supplementary
Information). We examined 106 groups of orthologous proteins identified using 
OrthoMCL'"” for which there are PDB representatives from at least two species 
(usually E. coli and Homo sapiens, Supplementary Tables 3-5). We considered 36 
representative species, each containing proteins in at least 90 of the 106 orthologue 
groups. Template-based three-dimensional structures for orthologues lacking a 
PDB-reported structure* were constructed using MODELLER", with side chains 
directly positioned with SCWRL”’. The template and resulting model were evaluated, 
ranked and finally selected using ProSA’*. The accuracy of homology models is 
shown in Supplementary Fig. 3. In cases where orthologous structural templates 
were unavailable, like the comparison of endosymbionts with free-living species, a 
sequence-based inference of SABHBs was performed based on an established anti- 
correlation between backbone protection and disorder propensity (Supplementary 
Fig. 5)°*. The cross validation of homology- and disorder-based estimations of v 
values is given in Supplementary Fig. 6. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Computation of PWIT. The parameter a in equation (2) is obtained from the
interfacial tension of a large non-polar sphere with radius $\theta$ in the limit
$\theta/1\,\mathrm{nm} \to \infty$. Thus we get $a = 9.02$ mJ m$^{-1}$ $= \lim_{\theta/1\,\mathrm{nm} \to \infty} (4\pi\theta^{2}\gamma)/[\frac{1}{2}\int |\nabla g|^{2}\,dr]$,
where $\gamma = 72$ mJ m$^{-2}$ is the bulk surface tension of water at 298 K, and
$\int |\nabla g|^{2}\,dr = O(\theta^{2})$ since $\nabla g \neq 0$ only in the vicinity of the interface. To determine
the g-dependence of polarization P = P(r), we adopt the Fourier-conjugate frequency
space ($\omega$ space) and represent the dipole correlation kernel $K_p(\omega)$ and the
electrostatic field E = E(r) in this space. In contrast with other treatments”, we
note that P and E are indeed proportional but the proportionality constant is $\omega$
dependent”. Thus, in $\omega$ space, we get


$$F(P)(\omega) = K_p(\omega)\,F(E)(\omega), \qquad (3)$$

where F denotes the three-dimensional Fourier transform $F(f)(\omega) = (2\pi)^{-3/2}\int e^{i\omega\cdot r} f(r)\,dr$,
and the kernel $K_p(\omega)$ is the Lorentzian $K_p(\omega) = (\varepsilon_b - \varepsilon_0)/(1 + (\tau(r)c)^{2}|\omega|^{2})$,
with $\tau(r)c$ = position-dependent dielectric relaxation scale ≈ 3 cm for $\tau = \tau_b \approx 100$
ps (c = speed of light), $\varepsilon_b$ = bulk permittivity and $\varepsilon_0$ = vacuum permittivity.
Because P(r) satisfies the Debye relation $\nabla\cdot(\varepsilon_0 E + P)(r) = \rho(r)$, where $\rho(r)$ =
charge density, equation (3) yields the following equation in r space”:

$$\nabla\cdot\left[\int F^{-1}(K)(r - r')\,E(r')\,dr'\right] = \rho(r), \qquad (4)$$

with $K(\omega) = \varepsilon_0 + K_p(\omega)$. The convolution $\int F^{-1}(K)(r - r')E(r')\,dr'$ captures the
correlation of the dipoles with the electrostatic field. Note that equation (4) is
not the Poisson-Boltzmann equation, which requires a proportionality between
the fields E and P under the ad hoc assumption $K(\omega)$ = constant.

Upon water confinement, the dielectric relaxation undergoes a frequency redshift
arising from the reduction in hydrogen-bond partnerships that translates to a
reduction in dipole orientation possibilities. Thus, at position r, the relaxation time
is $\tau = \tau_b \exp(B(g(r))/k_B T)$, where the kinetic barrier $B(g(r)) = -k_B T \ln(g(r)/4)$
yields $\tau(r) = \tau_b\,(g(r)/4)^{-1}$. Thus, for charge distribution,

$$\rho(r) = \sum_{m \in L} 4\pi q_m\,\delta(r - r_m), \qquad (5)$$

with L = set of charges on the protein surface labelled by index m, the g-dependent
polarization is obtained from equation (4) (Supplementary Information):


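$$P(r) = \int F^{-1}(K_p)(r - r')\,E(r')\,dr' = (2\pi)^{-3} \sum_{m \in L} \int dr'\,F^{-1}(K_p)(r - r')\,\nabla_{r'} \int d\omega\, e^{-i\omega\cdot(r' - r_m)}\,\frac{4\pi q_m}{|\omega|^{2}K(\omega)}. \qquad (6)$$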
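
The dependence of the relaxation time on water coordination stated above is easy to check numerically. The sketch below assumes the bulk relaxation time of about 100 ps quoted in the text and a barrier of the form B(g) = −k_B T ln(g/4); the function names are illustrative only.

```python
import math

TAU_BULK_PS = 100.0  # bulk dielectric relaxation time quoted in the text (ps)


def barrier_over_kT(g):
    """Kinetic barrier in units of k_B T: B(g)/k_B T = -ln(g/4), 0 < g <= 4."""
    if not 0.0 < g <= 4.0:
        raise ValueError("coordination g must lie in (0, 4]")
    return -math.log(g / 4.0)


def relaxation_time_ps(g, tau_bulk_ps=TAU_BULK_PS):
    """tau = tau_b * exp(B/k_B T), which simplifies to tau_b * (g/4)**-1."""
    return tau_bulk_ps * math.exp(barrier_over_kT(g))
```

Under these assumptions a water molecule that retains only two of its four hydrogen bonds relaxes twice as slowly as bulk water, which is the red shift referred to in the text.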


Spatially dependent coordination g= g(r). The time-averaged scalar field 
g = g(r) was obtained from classical trajectories generated by molecular dynamics. 
The computations started with the PDB structure of a free (uncomplexed) protein 
molecule embedded in a pre-equilibrated cell of explicitly represented water mole- 
cules and counterions'*'°. The molecular-dynamics trajectories were generated by 
adopting an integration time step of 2 fs in an NPT ensemble with box size 10³ nm³
and periodic boundary conditions**. The box size was calibrated so that the
solvation shell extended at least 10 Å from the protein surface at all times. The
long-range electrostatics were treated using the particle mesh Ewald summation
method’’. A Nosé-Hoover thermostat”* was used to maintain the temperature at
300 K, and a TIP3P water model with the optimized potential for liquid simulations
(OPLS) force field was adopted’*’’. A barostat scheme was maintained
through a dedicated routine with the pressure held constant at 1 atm using a
weak-coupling algorithm”. After equilibration for 300 ns, g values averaged over 
a time span of 100 ns were determined for each point in space. 

PWIT as promoter of protein-protein associations. The PWIT computed using 
equations (2) and (6) is generated by interfacial hotspots of red-shifted dielectric 
relaxation (g(r) < 4, $\tau(r) > \tau_b$). The most common spots involve hindered polar
hydration generated by SABHBs (Fig. 1a). Taken collectively, the SABHBs contribute
73 ± 5% to the interfacial tension (Supplementary Information). The
results are validated by showing that the inferred patches of interfacial tension 
promote protein associations, a conclusion supported by the tight correlation
(r² = 0.83) between the total area of surface patches begetting PWIT (increasing
the value of the integral in equation (2)) in free complex subunits, and the total 
protein-protein interfacial area of protein complexes (Supplementary Fig. 2a). 
The relevance of PWIT as a molecular determinant of protein-protein interactions
is further validated by showing that inferred tension patches actually coincide with 
hotspots at complex interfaces experimentally identified by mutational scanning 
(Supplementary Fig. 2b, c). 

Identification of SABHBs in soluble proteins. The extent of protection of a 
backbone hydrogen bond, ζ, was computed directly from PDB structural coordinates
by determining the number of side-chain non-polar groups contained within
a desolvation domain around the bond’”*. This domain was defined as two
intersecting spheres of fixed radius (approximate thickness of three water layers)
centred at the α-carbons of the residues paired by the hydrogen bond. In structures
of soluble proteins, backbone hydrogen bonds are protected on average by
ζ = 26.6 ± 7.5 non-polar groups for a desolvation sphere of radius 6 Å. SABHBs
lie in the tails of the distribution: that is, their microenvironment contains 19 or
fewer non-polar groups (ζ ≤ 19), so their ζ value is below the mean minus one
standard deviation. 
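
The counting rule just described can be sketched as follows. The 6 Å radius and the cutoff of 19 non-polar groups come from the text; treating the desolvation domain as the union of the two spheres, the function names and the coordinate representation are assumptions made for this illustration, and extracting α-carbon and non-polar group coordinates from a PDB file is left out.

```python
import math

RADIUS_A = 6.0      # desolvation sphere radius in angstroms, from the text
SABHB_CUTOFF = 19   # this many non-polar groups or fewer marks an SABHB


def protection(ca_i, ca_j, nonpolar_coords, radius=RADIUS_A):
    """Count non-polar groups inside either sphere centred at the
    alpha-carbons of the two residues paired by the hydrogen bond."""
    count = 0
    for p in nonpolar_coords:
        if math.dist(p, ca_i) <= radius or math.dist(p, ca_j) <= radius:
            count += 1
    return count


def is_sabhb(ca_i, ca_j, nonpolar_coords):
    """True when the bond is protected by SABHB_CUTOFF groups or fewer."""
    return protection(ca_i, ca_j, nonpolar_coords) <= SABHB_CUTOFF


def sabhb_ratio(bonds, nonpolar_coords):
    """Ratio of SABHBs to all backbone hydrogen bonds (the v value).

    bonds is a list of (ca_i, ca_j) coordinate pairs.
    """
    flagged = sum(is_sabhb(a, b, nonpolar_coords) for a, b in bonds)
    return flagged / len(bonds)
```

A group lying in the overlap of the two spheres is counted once, which is why the test uses a single pass over the non-polar coordinates.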

Sequence-based identification of SABHBs. SABHBs represent structural
vulnerabilities that have been characterized as belonging to a twilight zone
between order and native disorder. This characterization is justified by a strong
correlation between intramolecular hydrogen-bond protection, ζ, and propensity
for structural disorder ($f_d$) (Supplementary Fig. 5). The correlation reveals that the
inability to exclude water intramolecularly from pre-formed hydrogen bonds is 
causative of the loss of structural integrity. The disorder propensity is accurately 
quantified by a sequence-based score generated by the program PONDR-VLXT”, 
a predictor of native disorder that takes into account residue attributes such as 
hydrophilicity, aromaticity and their distribution within the window interrogated. 
The disorder score ($0 \le f_d \le 1$) is assigned to each residue within a sliding window,
representing the predicted propensity of the residue to be in a disordered region
($f_d = 1$, certainty of disorder; $f_d = 0$, certainty of order). Only 6% of 1,100 non-homologous
PDB proteins gave false-positive predictions of disorder in sequence
windows of 40 amino acids”*°. The strong correlation (Supplementary Fig. 5) 
between the disorder score of a residue and extent of protection of the hydrogen 
bond engaging the residue (if any) provides a sequence-based method of inference
of SABHBs and supports the picture that such bonds belong to an order-disorder 
twilight zone”. Thus SABHBs can be safely inferred in regions where the disorder
score lies in the range $0.35 \le f_d < 0.95$, which corresponds to a marginal BHB
protection with 7 ≤ ζ ≤ 19 (Supplementary Fig. 5).
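
The thresholding step can be illustrated with a short sketch. It assumes that per-residue disorder scores have already been produced by a predictor such as PONDR-VLXT (no call to that program is made here), and the function names are hypothetical.

```python
def sabhb_candidates(disorder_scores, low=0.35, high=0.95):
    """Indices of residues whose disorder score falls in [low, high),
    the window in which SABHBs are inferred in the text."""
    return [i for i, f in enumerate(disorder_scores) if low <= f < high]


def candidate_fraction(disorder_scores, low=0.35, high=0.95):
    """Fraction of residues that fall in the inference window."""
    if not disorder_scores:
        raise ValueError("no disorder scores supplied")
    return len(sabhb_candidates(disorder_scores, low, high)) / len(disorder_scores)
```

Note that the window is closed at 0.35 and open at 0.95, so a residue scored exactly 0.95 is treated as disordered rather than as an SABHB candidate.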

Evaluation of homology models. The homology models based on template PDB 
structures from orthologous proteins were evaluated, ranked and ultimately 
selected using ProSA"’, based on the minimization of $(Z_{\mathrm{mod}} - Z_{\mathrm{temp}})/Z_{\mathrm{temp}}$, where
$Z_{\mathrm{mod}}$ and $Z_{\mathrm{temp}}$ are the Z scores of model and template. The Z score of a structure
or template-based model is the energetic gap between the structure and an average 
over an ensemble of random conformations for the protein chain". 
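
The selection criterion can be sketched as below. The text does not state whether an absolute value is taken; the sketch assumes that the quantity minimised is the magnitude of the relative gap, so that the chosen model is the one energetically closest to its template. The Z scores are taken as given (no call to ProSA is made), and all names and numbers are hypothetical.

```python
def relative_z_gap(z_model, z_template):
    """Magnitude of (Z_mod - Z_temp) / Z_temp (absolute value is an assumption)."""
    if z_template == 0:
        raise ValueError("template Z score must be non-zero")
    return abs((z_model - z_template) / z_template)


def select_model(candidates):
    """Pick the candidate with the smallest relative Z-score gap.

    candidates is a list of (name, z_model, z_template) tuples.
    """
    return min(candidates, key=lambda c: relative_z_gap(c[1], c[2]))


# Hypothetical Z scores, for illustration only.
models = [
    ("model_a", -8.0, -10.0),
    ("model_b", -9.5, -10.0),
    ("model_c", -6.0, -10.0),
]
best = select_model(models)
```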
Control of visual cortical signals by prefrontal


dopamine 


Behrad Noudoost¹ & Tirin Moore¹


The prefrontal cortex is thought to modulate sensory signals in 
posterior cortices during top-down attention’”, but little is known 
about the underlying neural circuitry. Experimental and clinical 
evidence indicate that prefrontal dopamine has an important role 
in cognitive functions’, acting predominantly through D1 recep- 
tors. Here we show that dopamine D1 receptors mediate prefrontal 
control of signals in the visual cortex of macaques (Macaca 
mulatta). We pharmacologically altered D1-receptor-mediated 
activity in the frontal eye field of the prefrontal cortex and mea- 
sured the effect on the responses of neurons in area V4 of the visual 
cortex. This manipulation was sufficient to enhance the mag- 
nitude, the orientation selectivity and the reliability of V4 visual 
responses to an extent comparable with the known effects of top- 
down attention. The enhancement of V4 signals was restricted to 
neurons with response fields overlapping the part of visual space 
affected by the D1 receptor manipulation. Altering either D1- or 
D2-receptor-mediated frontal eye field activity increased saccadic 
target selection but the D2 receptor manipulation did not enhance 
V4 signals. Our results identify a role for D1 receptors in mediating 
the control of visual cortical signals by the prefrontal cortex and 
suggest how processing in sensory areas could be altered in mental 
disorders involving prefrontal dopamine. 

Dopamine D1 receptors (D1Rs) are expressed by about one-quarter 
of all neurons in the prefrontal cortex and are localized primarily in 
superficial and deep layers**. Microiontophoretic application of the 
selective D1R antagonist SCH23390’ at certain doses can increase the 
persistent, working-memory-related component of single-neuron 
activity in the dorsolateral prefrontal cortex**”. Given the role of the 
prefrontal cortex in visual attention’’, we hypothesized that D1Rs 
might also mediate the top-down control of visual cortical signals by 
the prefrontal cortex. If so, then changes in D1R-mediated prefrontal 
cortex activity might be sufficient to modulate signals in the posterior 
visual cortex, similar to the modulation observed during selective 
attention’’. The prefrontal cortex’s influence on the visual cortex is 
achieved in part by the frontal eye field (FEF)’""’, an oculomotor area 
within the posterior prefrontal cortex. The FEF has a well-established 
role in saccadic target selection’, but recent evidence also implicates 
this area in the control of spatial attention®’*"*. To test our hypothesis, 
we locally infused’® small volumes (0.5-1 µl) of SCH23390 into sites in
the FEF of macaques performing fixation and eye movement tasks 
(Fig. 1a, b and Supplementary Fig. 1). We measured the effects of
the FEF infusion on target selection using a free-choice saccade task”. 
In this task, monkeys were rewarded for choosing between two saccadic 
targets, one located within the FEF response field and one in the opposite 
hemifield. In the same experiment, we recorded the visual responses of 
single neurons in area V4 during fixation. In particular, we recorded 
neurons with response fields that overlapped the FEF response field. 
Thus, we tested the effects of the D1R manipulation on both visual
cortical signals and saccadic target selection. 

We found that altering D1R-mediated activity at FEF sites increased 
the tendency of monkeys to choose targets appearing within the FEF 
response field (Fig. 1b). In the free-choice task, the temporal onset of 


the two targets was systematically varied such that the FEF response 
field stimulus could appear earlier or later than the opposite stimulus. 
A monkey’s tendency to select the FEF response field target could then 
Figure 1 | Local manipulation of D1R-mediated activity in the FEF during
single-neuron electrophysiology in area V4. a, Lateral view of the macaque 
brain depicting the location of a recording microsyringe in the FEF and of 
recording sites in area V4. Bottom diagram shows saccades evoked via electrical 
microstimulation at the infusion site (red traces) and the response field (RF, 
green ellipse) of a recorded V4 neuron in an example experiment. b, Double- 
target saccade task used to measure the monkey’s tendency to make saccades to 
a target within the FEF response field versus one at an opposite location across 
varying temporal onset asynchronies. Positive asynchrony values denote earlier 
onset of FEF response field targets. Bottom plot shows the leftward shift in the 
PES, indicating more FEF response field choices, after infusion of SCH23390 
into an FEF site. c, Visual responses of a V4 neuron with a response field that 
overlapped the FEF response field, measured during passive fixation. The plot 
shows mean ± s.e.m. of visual responses to a bar stimulus presented at
orthogonal orientations before (grey) and after (red) the infusion of SCH23390 
at the FEF site. 
be measured as the temporal onset asynchrony required for an equal
probability of selecting either stimulus; we termed this the point of 
equal selection (PES). In the example experiment shown, the monkey 
chose the FEF response field target as often as the opposite target when 
the former appeared 76 ms earlier (PES = 76). However, infusion of 
SCH23390 (0.85 µl) into the FEF reduced the PES by 23 ms (binary
logistic regression, P = 0.007), thereby increasing the proportion of 
FEF response field target choices. 
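
One way to estimate a PES is to fit a logistic choice function to the proportion of response-field choices at each onset asynchrony and read off the asynchrony at which the fitted probability equals 0.5. The sketch below is an illustration under that assumption and is not the authors' analysis code: the Newton-Raphson fit, the function names and the synthetic data (generated from a curve with a PES of 76 ms) are all hypothetical.

```python
import math


def fit_logistic(x, y, iterations=25):
    """Fit P(choice) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson.

    x holds onset asynchronies (ms); y holds choice outcomes (0/1) or
    choice proportions for the response-field target.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p            # gradient of the log-likelihood
            g1 += (yi - p) * xi
            h00 += w                # entries of the negative Hessian
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def point_of_equal_selection(b0, b1):
    """Asynchrony at which both targets are chosen equally often."""
    return -b0 / b1


# Synthetic choice proportions from a curve with PES = 76 ms (illustrative).
soa_ms = [-100.0, -50.0, 0.0, 50.0, 100.0, 150.0, 200.0]
p_choice = [1.0 / (1.0 + math.exp(-(-1.52 + 0.02 * s))) for s in soa_ms]
pes = point_of_equal_selection(*fit_logistic(soa_ms, p_choice))
```

A leftward shift of the fitted curve after an infusion then appears as a smaller PES, as in the example experiment.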

In the same experiment, we also measured the responses of V4 
neurons to oriented bars during fixation in a separate task (Fig. 1c 
and Supplementary Methods). We found that the increase in target 
selection after the SCH23390 infusion was accompanied by an 
enhanced V4 neuronal response to oriented bars appearing within 
the overlapping V4 and FEF response fields. The example neuron 
shown was selective for orientation: it responded more to the 45° than 
to the 135° bar stimulus (P< 10 °). After the infusion of SCH23390, 
there was a significant increase in the overall visual response of this 
neuron as well as a significant increase in the differential response to 
the two orientations (two-way analysis of variance, SCH23390 effect, 
P<10 °; SCH23390-orientation interaction, P< 10°). Thus, the 
local perturbation of D1R-mediated FEF activity not only caused the 
monkey to select FEF response field stimuli as saccade targets more 
frequently, it also led to enhanced and more selective visual responses 
of a V4 neuron representing the same part of space. 

We studied the visual responses of 37 V4 neurons with response 
fields that overlapped the response fields of FEF infusion sites. The 
average (mean ± s.e.m.) distance between V4 response field and FEF
response field centres was 0.71 ± 0.07 degrees of visual angle (d.v.a.)
(Fig. 2a). As with the example neuron, we measured the responses of all
neurons to oriented bars appearing in their response field during a 1 s
fixation period (Fig. 2b). Before the onset of the visual stimulus, there
was a significant elevation in baseline activity after the D1R manipulation
(Δ baseline = 0.077 ± 0.186, P = 0.030). In addition to the baseline
increase, the visually driven response of V4 neurons was enhanced
by 17% above the control response (Δ response = 0.121 ± 0.054,
P = 0.018). We confirmed that the enhancement in the visual response
was not due to systematic changes in eye position during stimulus
presentation (Supplementary Fig. 2). The enhancement of the
visual response was independently significant for both preferred
(Δ preferred = 0.264 ± 0.087; P = 0.004) and non-preferred stimuli
(Δ non-preferred = 0.132 ± 0.062; P = 0.032). There was also an
increase in the response difference between the preferred and non-preferred
orientations (Δ response difference = 0.132 ± 0.041;
P = 0.004) (Supplementary Fig. 3), indicating an increase in orientation
selectivity. To measure selectivity more quantitatively, we used a
receiver-operating characteristic (ROC) analysis to quantify the degree 
to which each neuron’s responses could be used to judge stimulus 
orientation (Fig. 2c). This analysis confirmed that V4 neurons were 
more orientation selective after changes in D1R-mediated FEF activity 
(Δ ROC area = 0.035 ± 0.009, P< 10 °). The enhancement in the 
magnitude and selectivity of the V4 response was accompanied by a 
decrease in the trial-to-trial variability of visual responses. We mea- 
sured the variability of V4 responses across trials by computing the 
Fano factor, which is the variance in the spike count divided by 
its mean. We found that the Fano factor of V4 responses was reduced 
after the D1R manipulation (Δ FF = −0.105 ± 0.045; P<10~*) 
(Fig. 2d and Supplementary Fig. 4). All three V4 effects were com- 
parable in magnitude to the known effects of top-down attention and 
consistent with a multiplicative increase in the gain of visual signals'*”” 
(Fig. 2e). 
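
Two of the measures used above have compact definitions that are easy to state in code. The following is a minimal sketch, not the authors' implementation: it assumes spike counts per trial as input, uses the sample variance for the Fano factor, and computes the ROC area by pairwise comparison; the function names are illustrative.

```python
from statistics import mean, variance

def fano_factor(spike_counts):
    """Variance of the spike count across trials divided by its mean.
    Uses the sample (n - 1) variance, which is an assumption here."""
    return variance(spike_counts) / mean(spike_counts)

def roc_area(pref_counts, nonpref_counts):
    """Area under the ROC curve for discriminating two orientations:
    the probability that a randomly drawn preferred-orientation trial has
    more spikes than a randomly drawn non-preferred trial (ties count 0.5)."""
    wins = 0.0
    for p in pref_counts:
        for q in nonpref_counts:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pref_counts) * len(nonpref_counts))

# Hypothetical spike counts, for illustration only.
print(fano_factor([2, 4, 4, 4, 5, 5, 7, 9]))
print(roc_area([5, 6, 7], [3, 4, 5]))
```

An ROC area of 0.5 means the two response distributions are indistinguishable, and a value of 1 means they are perfectly separable, so an increase in ROC area corresponds to greater orientation selectivity.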

The effect of the D1R manipulation on saccadic target selection 
was highly consistent across the two monkeys tested. In 21 double- 
target experiments, the PES was reduced in every case (Fig. 3a). The 
mean PES shifted in favour of the FEF response field stimulus by an 
average of 27 ms (Δ PES = −26.934 ± 3.086, P< 10°), significantly 
increasing the overall proportion of FEF response field choices 

Figure 2 | Manipulation of D1R-mediated activity enhances V4 visual 
signals. a, Average vectors of saccades evoked at all FEF sites that overlapped 
V4 response fields (left panel). The distribution of distances between the 
endpoints of evoked saccades and the centres of overlapping V4 response fields 
for 37 V4 neurons is shown in the right panel. b-d, The mean normalized 
response magnitude (b), orientation selectivity (c) and response variability 
(Fano factor) (d) of V4 neurons before (grey) and after (red) microinfusion of 
SCH23390 into the FEF. Means ± s.e.m. are shown within a 100-ms moving 
window measured during the 1-s response field stimulus presentation (top 
event plot). Histograms to the right of each response profile show the 
distributions of modulation indices for response magnitude (b), selectivity 
(c) and variability (d) across the population of neurons. e, Comparison of V4 
response modulation after the SCH23390 infusion for preferred and non- 
preferred response field stimuli. 


(chi-squared = 80.60, P< 10 °) and thus indicating that the D1R 
manipulation increased the monkeys’ tendency to target FEF response 
field stimuli. The increase in target selection was apparent across a 
range of drug dosages (Supplementary Fig. 5). In addition to the 
D1R manipulation, we tested the effects of the D2R agonist quinpirole. 
Previous studies using this drug found that it does not affect persistent 
activity but rather increases saccade-related activity within the dorso- 
lateral prefrontal cortex’. We found that local manipulation of D2R- 
mediated FEF activity, like the D1R manipulation, increased the selection 
of FEF response field targets (Fig. 3a). The PES shifted by an 
average of 22 ms (Δ PES = −21.993 ± 6.758, P = 0.010), increasing 
the proportion of FEF response field choices (chi-squared = 13.86, 
P<10° ). Thus, the D1R- and D2R-mediated manipulations of FEF 
activity resulted in equivalent increases in saccadic target selection. 

Figure 3 | Changes in saccadic target selection and V4 visual responses. 

a, Scatter plot shows the consistent increase in FEF response field target choices 
(decrease in PES) after manipulation of both D1R-mediated (circles) and D2R- 
mediated (triangles) FEF activity. For both drug effects, the increase in FEF 
response field target selection was constant across a range of control PES values; 
the slope in the linear fit did not differ significantly from unity in either case 


Despite the increase in target selection, manipulation of D2R- 
mediated activity in the FEF failed to enhance the responses of V4 
neurons. We found no significant effect on the visual response mag- 
nitude, orientation selectivity or response variability of V4 neurons 
after the D2R manipulation (Δ response = 0.001 ± 0.048, P = 0.999; 
Δ ROC area = −0.007 ± 0.010, P = 0.426; Δ FF = 0.037 ± 0.052, 
P = 0.338; n = 15) (Fig. 3b). Moreover, the changes in these measures 
were all significantly different from the changes we observed after 
the D1R manipulation (Δ response_D2R < Δ response_D1R, P = 0.045; 
Δ selectivity_D2R < Δ selectivity_D1R, P = 0.011; Δ FF_D2R > Δ FF_D1R, 
P = 0.019). Thus, the equivalent effects of D1R and D2R manipulations 
on saccadic target selection were accompanied by contrasting 
effects in V4, with the enhancement of visual signals being specific 
to D1R-mediated activity. We also found that this enhancement was 
confined to V4 neurons with response fields that overlapped the FEF 
response field. For V4 neurons with response fields that did not overlap 
the FEF response field (mean distance between V4 response field and 
FEF response field = 9.00 ± 0.86 d.v.a.; n = 15), we found no significant 
effect of the D1R manipulation on response magnitude 
(Δ response = −0.028 ± 0.087, P = 0.9780), orientation selectivity 
(Δ ROC area = −0.017 ± 0.010, P = 0.187) or the Fano factor 
(Δ FF = 0.010 ± 0.043, P = 0.688). Of note, the changes in these measures 
were all significantly different from the changes observed in neurons 
with overlapping response fields (Δ response_non-overlap < Δ response_overlap, 
P = 0.044; Δ selectivity_non-overlap < Δ selectivity_overlap, P = 0.007; 
Δ FF_non-overlap > Δ FF_overlap, P = 0.034) (Fig. 3b). Thus, the enhancement 
in visual cortical signalling produced by manipulation of D1R- 
mediated FEF activity was spatially specific. 

We also tested the effect of complete inactivation of FEF sites on the 
responses of V4 neurons with overlapping response fields. Previous 
studies have shown that local inactivation of the FEF disrupts saccadic 
target selection and impairs attention'””’. We therefore wondered if 
inactivation could reduce the components of V4 responses that were 
enhanced by the D1R manipulation. We locally inactivated FEF sites 
using the GABA_A (γ-aminobutyric acid subtype A) receptor agonist 
muscimol. Unlike the sparsely expressed D1Rs, GABA_A receptors are 
expressed by all neurons in all cortical layers”. As in previous studies, 
local inactivation of FEF sites with muscimol decreased the targeting of 
FEF response field stimuli. It also significantly reduced V4 orientation 
selectivity (Δ ROC area = −0.030 ± 0.011, P = 0.003; n = 33). 
However, the inactivation did not change the response magnitude or 
variability of V4 neurons (Δ response = 0.016 ± 0.061, P = 0.809; 
Δ FF = −0.002 ± 0.023, P = 0.921) (Fig. 3b). Thus, in contrast to 
the D1R manipulation, which altered all three components of V4 
activity, complete inactivation altered only one. All three inactivation 
effects were significantly different from the D1R effects 

(D1R: slope = 0.96, P = 0.552; D2R: slope = 0.97, P = 0.502). b, Changes in 
response magnitude, orientation selectivity and response variability (Fano 
factor) after each drug manipulation. Changes shown are mean differences 
from pre-infusion values. Error bars denote s.e.m.; *, P< 0.05; **, P< 0.01; 
***, P < 0.001. 


(Δ response_muscimol < Δ response_D1R, P = 0.024; Δ selectivity_muscimol < 
Δ selectivity_D1R, P<10 7°; Δ FF_muscimol > Δ FF_D1R, P = 0.007). 
Although the reduction in orientation selectivity is consistent with 
previous electrical microstimulation studies’* and with the effects of 
inactivation on orientation discrimination”’, the lack of a reduction in 
response magnitude may seem inconsistent. However, we suggest that 
this difference is due to variation between experimental paradigms 
(Supplementary Discussion). Finally, we tested for any effect of 
vehicle (saline) infusion into the FEF. The infusion of saline failed to 
change the response magnitude, selectivity or variability of V4 
neurons (Δ response = 0.018 ± 0.048, P = 0.380; Δ ROC area = 
−0.010 ± 0.013, P = 0.569; Δ FF = −0.035 ± 0.061, P = 0.179; 
n = 12) (Fig. 3b). All three measures were significantly different from 
the D1R effects (Δ response_saline < Δ response_D1R, P = 0.045; 
Δ selectivity_saline < Δ selectivity_D1R, P = 0.013; Δ FF_saline > Δ FF_D1R, 
P = 0.009). 

Our results identify prefrontal D1Rs as a component of the neural 
circuitry controlling signals in the visual cortex. Manipulation of D1R- 
mediated FEF activity was sufficient to enhance the magnitude, 
reliability and visual selectivity of neuronal responses in area V4, three 
known effects of visual attention. The observed enhancement might 
account for the benefits in visually guided behaviour that accompany 
attentional deployment (Supplementary Fig. 6), although a causal link 
between attentional modulation of visual cortical signals and visual 
perception remains to be established. We have demonstrated that 
visual representations in posterior areas can be altered merely by 
changes in dopamine tone in the prefrontal cortex. Given the complex 
effects of dopamine through D1Rs, one might predict that at 
‘optimum’ dopamine levels’, optimal top-down control of visual cor- 
tical signals would be achieved. 

The circuitry underlying top-down control of the visual cortex 
probably involves several different neuromodulators” and an array 
of different brain structures”. Our results show that this circuitry 
involves prefrontal dopamine acting via D1Rs. In the dorsolateral 
prefrontal cortex, dopamine D1Rs are thought to modulate recurrent 
glutamatergic connections, thereby influencing activity related to 
working memory in this area”**’. This study shows that D1Rs contri- 
bute to the FEF’s control of visual signals by an analogous mechanism, 
namely by modulating long-range, recurrent connections between the 
FEF and the visual cortex (Supplementary Fig. 7). Because FEF neu- 
rons in the superficial layer are reciprocally connected with neurons in 
V4’’, dopaminergic modulation of these connections via D1Rs in the 
superficial layer would be expected to mediate the FEF’s control of V4 
signals. The specificity of V4 effects to D1Rs, rather than D2Rs, might 
be explained by the relative absence of D2Rs in superficial layers of 
the prefrontal cortex**®. The equivalent effects of D1R and D2R 
manipulations on target selection might be explained by the presence 
of both receptor subtypes in infragranular layers of the cortex* °, where 
layer-V FEF neurons project to the superior colliculus”. 

Impairments in saccadic control are prominent among the impair- 
ments exhibited in attention deficit/hyperactivity disorder (ADHD)”**. 
The observed influence of prefrontal D1Rs on saccadic target selection 
and visual cortical signals, combined with their known influence on 
persistent activity, may explain the behavioural links between saccadic 
control, attention and working memory” and the coincidence of their 
corresponding impairments in ADHD”. 


METHODS SUMMARY 


The effects of pharmacological perturbation of FEF activity on target selection and 
the visual responses of V4 neurons were studied in three macaques (Macaca 
mulatta) performing fixation and eye movement tasks (Supplementary 
Methods). All experimental procedures were in accordance with the National 
Institutes of Health guide for the care and use of laboratory animals and with 
the Society for Neuroscience guidelines and policies. They were also approved 
by the Stanford University animal care and use committee. Eye position was 
monitored with a scleral search coil. In each experiment, we infused small volumes 
of drug into sites in the FEF through a surgically implanted titanium chamber 
overlying the arcuate sulcus using a custom-made recording microinjectrode. We 
identified FEF sites by eliciting short-latency, fixed-vector saccadic eye movements 
with trains (50-100 ms) of biphasic current pulses (≤50 µA; 250 Hz; 0.25 ms 
duration). In the same experiment, recordings from V4 neurons were made 
through a chamber overlying the prelunate gyrus. Response fields of V4 neurons 
were all located in the lower quadrant of the contralateral hemifield (<12° eccent- 
ricity). The position of the FEF microinjectrode was adjusted so that the saccade 
elicited by FEF microstimulation shifted the monkey’s gaze either to within the V4 
response field (overlapping) or far outside it (non-overlapping). 

COP1 is a tumour suppressor that causes degradation 
of ETS transcription factors 

The proto-oncogenes ETV1, ETV4 and ETV5 encode transcription 
factors in the E26 transformation-specific (ETS) family, which 
includes the most frequently rearranged and overexpressed genes 
in prostate cancer’ *. Despite being critical regulators of develop- 
ment, little is known about their post-translational regulation. 
Here we identify the ubiquitin ligase COP1 (also known as 
RFWD2) as a tumour suppressor that negatively regulates ETV1, 
ETV4 and ETV5. ETV1, which is mutated in prostate cancer more 
often, was degraded after being ubiquitinated by COP1. Truncated 
ETV1 encoded by prostate cancer translocation TMPRSS2:ETV1 
lacks the critical COP1 binding motifs and was 50-fold more stable 
than wild-type ETV1. Almost all patient translocations render 
ETV1 insensitive to COP1, implying that this confers a selective 
advantage to prostate epithelial cells. Indeed, COP1 deficiency in 
mouse prostate elevated ETV1 and produced increased cell pro- 
liferation, hyperplasia, and early prostate intraepithelial neoplasia. 
Combined loss of COP1 and PTEN enhanced the invasiveness of 
mouse prostate adenocarcinomas. Finally, rare human prostate 
cancer samples showed hemizygous loss of the COP1 gene, loss 
of COP1 protein, and elevated ETV1 protein while lacking a trans- 
location event. These findings identify COP1 as a tumour sup- 
pressor whose downregulation promotes prostatic epithelial cell 
proliferation and tumorigenesis. 

Mass spectrometry showed that ETV1, ETV4, and ETV5 co-immunoprecipitated 
specifically with Flag-tagged COP1 from a mouse 
kidney epithelial cell line (Supplementary Fig. 1a). Known COP1-interacting 
proteins DET1 (ref. 5) and TRIB3 (ref. 6) also were co-immunoprecipitated 
(Supplementary Table 1). The ETVs each contain 
three potential COP1-binding motifs, with endogenous COP1 and 
ETV1 interacting in LNCaP prostate cancer cells (Supplementary 
Fig. 1b-d). An inverse correlation between ETV1 and COP1 proteins 
in prostate cancer cell lines suggested ETV1 might be a COP1 substrate 
(Fig. 1a). For example, PC3 cells and their derivatives showed COP1 
loss (data not shown), lacked detectable COP1, but did express ETV1. 
By contrast, COP1-expressing BPH1, BPH1025 and LNCaP cells 
lacked ETV1 protein, despite LNCaP cells containing more ETV1 
mRNA than PC3 cells (Fig. 1a)’. Consistent with the notion that 
COP1 in LNCaP cells rendered newly synthesized ETV1 unstable, 
ETV1 protein, but not mRNA, was increased in LNCaP cells by either 
proteasome inhibition with MG-132 or Bortezomib (Fig. 1b) or siRNA 
knockdown of endogenous COP1 (Fig. 1c). The latter increased the 
half-life of ETV1 approximately 50-fold (Supplementary Fig. 2a). MG- 
132 did not cause ETV1 accumulation in COP1-deficient PC3 cells 
(Supplementary Fig. 2b), consistent with ETV1 being less subject to 
proteasomal degradation in the absence of COP1. 
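
A fold change in half-life such as the one quoted above is usually derived from a decay time course. The sketch below is a hypothetical illustration, not the authors' method: it assumes first-order (exponential) decay of protein abundance after synthesis is blocked, and the function name, time points and abundance values are invented for the example.

```python
import math

def half_life(times_h, abundances):
    """Estimate a half-life from a decay time course, assuming first-order
    decay: the least-squares slope of ln(abundance) against time gives the
    rate constant k, and the half-life is ln(2) / k."""
    logs = [math.log(a) for a in abundances]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    k = -sxy / sxx
    return math.log(2) / k

# Hypothetical time courses (hours, relative abundance); not data from the paper.
times = [0, 1, 2, 4]
control = [math.exp(-math.log(2) / 0.5 * t) for t in times]    # 0.5 h half-life
knockdown = [math.exp(-math.log(2) / 25.0 * t) for t in times]  # 25 h half-life
fold_change = half_life(times, knockdown) / half_life(times, control)
print(round(fold_change, 1))
```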


Mutation of the COP1 RING domain (C136A/C139A), which 
destroys its E3 ubiquitin ligase activity*”, or deletion of COP1 residues 
required for interaction with ligase component DET1 (COP1Δ24)° 
prevented overexpressed COP1 from decreasing endogenous ETV1 
in PC3 cells (Supplementary Fig. 2c, d). MG-132, Bortezomib, and 
DET1 knockdown in PC3 cells also abrogated ETV1 destabilization 
by COP1 (Supplementary Fig. 2e, f). ETV1 degradation by COP1 and 
DET1 was not limited to prostate cancer cells because co-expressed 
COP1 and DET1 reduced ETV1 in DET1-deficient HCC1806 breast 
cancer cells (Supplementary Fig. 2g-i). The concerted action of COP1 
and DET1 seen in Arabidopsis thaliana’ therefore is conserved in the 
regulation of mammalian ETV1. 

Consistent with COP1 ubiquitin ligase activity being critical for ETV1 
degradation, ETV1 co-expressed with COP1 and DET1 in 293T cells 
was ubiquitinated (Fig. 1d) and the polyubiquitin chains contained 
degradative K48 linkages'' (Supplementary Fig. 2j). Conversely, COP1 
knockdown in LNCaP cells decreased ubiquitination of endogenous 
ETV1 (Supplementary Fig. 2k). We predict that ETV4 and ETV5 are 
regulated similarly because of their conserved COP1 binding motifs and 
reduced expression in PC3 cells upon coexpression with COP1 and 
DET1 (Fig. 1e). Our findings support previous observations that 
COP1 regulates the stability of ETV1, ETV4 and ETV5 in vitro’. Of 
note, the ETS transcription factor ERG, which lacks a COP1-binding 
motif, was not decreased by COP1 and DET1 (Fig. 1e). 

Almost all reported ETV1 chromosomal rearrangements in human 
prostate tumours yield N-terminally truncated ETV1 (ΔETV1), lacking 
the two N-terminal COP1 binding motifs (Supplementary Fig. 3a)'”. 
We proposed that these ETV1 mutants evade COP1-mediated degra- 
dation and this contributes to their overexpression in prostate cancers. 
Unlike wild-type ETV1, ΔETV1 did not co-immunoprecipitate with 
RING mutant COP1 (Supplementary Fig. 3b, c). Nor was ΔETV1 
polyubiquitinated and degraded by wild-type COP1 and DET1 (Supplementary 
Fig. 3d, e). To define the contribution of the three putative 
COP1-binding motifs or ‘degrons’ in ETV1, we mutated their conserved 
residues (VP to AA; Fig. 2a). Mutation of degron 2 alone, but 
not degrons 1 or 3, decreased the interaction of ETV1 with RING 
mutant COP1, and combining the degron 1 and 2 mutations compromised 
the interaction further (Supplementary Fig. 3c). The degron 
(1+2) mutant, like ΔETV1, was no longer subject to COP1- and 
DET1-mediated degradation (Fig. 2b). 

Unlike most oncogene products, ETV1 overexpressed in cultured 
prostate cells promotes invasive behaviour rather than cell prolifera- 
tion”*. COP1 silencing in LNCaP cells or overexpression in PC3 cells 
also had no effect on cell proliferation (data not shown). Instead, COP1 
expression reduced PC3 cell migration through a collagen-coated 

Figure 1 | The COP1/DET1 ubiquitin ligase regulates ETV1 turnover. 

a, ETV1 expression in prostate cancer cell lines. Arrow, ETV1 protein; asterisk, 
bands of unknown identity. COP1 and ETV1 mRNA expression is plotted 
relative to the BPH1 sample, which is assigned a value of 1. b, c, ETV1 protein in 
LNCaP cells treated for 2 h with dimethylsulphoxide (DMSO), 20 µM MG-132, 
or 2 µM bortezomib (b) or transfected with four independent COP1 siRNAs or 
a non-targeting (Ctrl) siRNA (c). Bars indicate the mean ETV1 or COP1 mRNA 
level ± s.d. of triplicate wells after normalization to RPLP0 gene expression. 
d, HEK293T cells were transfected and treated with 10 µM MG-132 for 2 h 
before lysis. Solid triangles indicate increasing plasmid DNA. COP1"”" 
contained mutated residues C136A/C139A. Immunoprecipitations (IP) used 
SDS- and heat-denatured lysates. GFP, green fluorescent protein; HA, 
haemagglutinin. e, Transfected PC3 cells. 


membrane, and this inhibition required the COP1 RING domain 
(Supplementary Fig. 3f). ETV1 degradation by COP1 likely contributed 
to decreased PC3 cell invasion because COP1 limited migration of 
cells coexpressing wild-type ETV1, but not the degron (1+2) ETV1 
mutant that escapes COP1-mediated degradation (Fig. 2c). 

Negative regulation of proto-oncogenic ETVs suggested a tumour 
suppressor role for COP1. In a renal graft model of prostate regeneration!*!°, 
Cop1+/+, Cop1+/− and Cop1−/− prostate epithelial cells from 
gene-targeted mice (Supplementary Fig. 4) formed prostate structures 
at a similar frequency (Supplementary Table 2). Half of the Cop1−/− 
structures, however, exhibited increased epithelial cell piling, loss of 
polarization, tortuous acini, and enhanced stromal cell invasion 
(Supplementary Fig. 5a, b). All structures expressed probasin, indi- 
cative of successful differentiation into prostatic secretory epithelium 
(Supplementary Fig. 5c), but COP1 loss was associated with increased 
expression of the luminal cell marker cytokeratin 18, as seen in prostate 
malignancies (Supplementary Fig. 5d, f). Cop1−/− structures 
also showed a marked increase in Ki67-positive proliferating cells 

Figure 2 | Truncated ETV1 encoded by prostate cancer translocations is not 
degraded by the COP1/DET1 ubiquitin ligase. a, Human ETV1 and truncated 
ΔETV1 encoded by TMPRSS2:ETV1. COP1 binding motifs are indicated as 
degrons 1, 2, and 3. b, Transfected HEK293T cells. ETV1(deg1+2), 
ETV1(V63A/P64A/V71A/P72A). ETV1 protein was quantified and normalized to β-actin. 
Bars indicate the mean ± s.d. of four replicates. c, PC3 cells stably expressing 
doxycycline-inducible COP1 were transfected with GFP plus empty vector, ETV1, 
or ETV1(deg1+2). Cells were tested for migration through a collagen-coated 
membrane after 40 h. GFP+ cells accessing the lower chamber were imaged and 
the GFP+ area (invading cells) plotted as percentage of cell mask. Bars show the 
mean ± s.e.m. of four replicate wells. **P = 0.0025, ***P < 0.0001, t-test. 


(Supplementary Fig. 5e, g). These results indicate that COP1 deficiency 
causes aberrant prostatic epithelial cell growth. 

Next we deleted Cop1 in mouse prostatic epithelial cells in vivo with a 
probasin-Cre (Pb-Cre) transgene’’. Wild-type (Cop1+/+ Pb-Cre+) and 
COP1-deficient (Cop1fl/fl Pb-Cre+) prostates were comparable up to 
24 weeks of age, but hyperplasia was evident in COP1-deficient prostate 
lobes by 40 weeks (Supplementary Table 3). By 52 weeks of age, hyperplasia 
was evident in all COP1-deficient prostates, and two out of six had 
developed low-grade mouse prostatic intraepithelial neoplasia (MPIN)'” 
in the ventral lobe (Fig. 3a—c and Supplementary Fig. 5h). MPIN lesions 
were confined to the gland and were reminiscent of abnormalities 
observed in transgenic mice expressing the TMPRSS2:ETV1 gene pro- 
duct from the probasin promoter’. Cop1 gene deletion correlated with 
decreased COP1 protein expression, increased cell proliferation, and 
increased ETV1, ETV4 and c-JUN (Supplementary Fig. 5i-p). These 
data indicate that COP1 suppresses prostate tumour development and 
regulates the abundance of ETV1, ETV4 and c-JUN in vivo, all of which 
have been linked to prostate cancer'*"*. 

Loss of the tumour suppressor PTEN is reported in ~50% of primary 
prostate cancers'®”°, and overexpression of ERG cooperates with PTEN 
loss in accelerating prostate cancer progression*’”’. We investigated 
whether COP1-deficiency in the prostate would cooperate similarly 
with PTEN loss and enhance tumour progression. By 30 weeks of 
age, prostates lacking COP1 and PTEN presented a more aggressive 
carcinoma when compared to prostates lacking only PTEN (Fig. 3d-f 
and Supplementary Table 3). Neoplastic epithelial cells in the Cop1fl/fl 
Ptenfl/fl Pb-Cre+ prostate extended beyond the basement membrane, 
invading the stromal compartment and the adjacent muscle bundles. 
Many cells were poorly differentiated with marked cellular atypia and 
nuclear pleomorphism (Fig. 3d—f). Loss of COP1 again correlated with 
elevated ETV1, ETV4 and c-JUN (Fig. 3g). Our observations suggest 
cooperation between COP1 and PTEN in suppressing tumorigenesis. 
We speculate that stabilization of ETV1 and ETV4 in the absence of 
COP1 upregulates matrix metalloproteinases and enhances cell invasion®**. 
Knockdown of endogenous COP1 in LNCaP cells increased 
MMP1, MMP7 and MMP10 gene expression (Supplementary Fig. 6a— 
d), but this did not occur if both COP1 and ETV1 were knocked down. 
Indeed, microarray analyses showed that over 75% of the genes up- 
regulated in LNCaP cells after COP1 knockdown were ETV1-dependent 
(Supplementary Fig. 6e). Silencing of COP1, ETV1 and c-JUN returned 
92% of the upregulated genes to basal levels, indicating that ETV1 and 
c-JUN are both targets of COP1 suppression in LNCaP cells (Sup- 
plementary Fig. 6f, g and Supplementary Table 4). 

Next we determined whether COP1 loss correlated with elevated 
ETV1 protein expression in human prostate cancer. Analysis of 166 
comparative genomic hybridization (CGH) array data sets identified 
five cases with COP1 loss (Fig. 4a). We retrieved three of these cases 
and focal loss of COP1 protein correlated with elevated ETV1 protein 
(Fig. 4b, c and data not shown). Regions of normal COP1 staining 
expressed minimal ETV1, supporting the inverse correlation between 
COP1 and ETV1 (Fig. 4d). Fluorescence in situ hybridization (FISH) 
in each case revealed loss of one copy of COP1 (Fig. 4e and data not 
shown), and in situ hybridization indicated silenced COP1 mRNA 
expression (Supplementary Fig. 7a). ETV1 and ERG break-apart FISH 
excluded that ETV1 overexpression in these samples was translocation- 
related (Fig. 4e and data not shown). 
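
The copy-number call can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the pipeline used in the study: it takes the log2 ratio cutoff of −0.3 given in the Fig. 4a legend, treats any value below the cutoff as a loss, and models tumour purity as a simple mixture of tumour and diploid cells; the function names and purity values are hypothetical.

```python
import math

LOSS_CUTOFF = -0.3  # log2 ratio cutoff quoted in the Fig. 4a legend

def expected_log2_ratio(tumour_copies, purity, normal_copies=2):
    """Expected log2 copy-number ratio for a sample in which a fraction
    `purity` of cells carries `tumour_copies` of the locus and the
    remaining cells carry `normal_copies`."""
    mixed = purity * tumour_copies + (1.0 - purity) * normal_copies
    return math.log2(mixed / normal_copies)

def has_copy_loss(log2_ratio, cutoff=LOSS_CUTOFF):
    """Call a loss when the log2 ratio falls below the cutoff (assumed strict)."""
    return log2_ratio < cutoff

# Hemizygous loss (one of two copies) at several hypothetical tumour purities.
for purity in (1.0, 0.5, 0.3):
    ratio = expected_log2_ratio(1, purity)
    print(purity, round(ratio, 3), has_copy_loss(ratio))
```

Under this simple mixture model, a hemizygous loss in a sample with substantial normal-cell content can fall short of the cutoff, which is one reason a fixed threshold may undercount losses.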

In parallel, we screened 120 human prostate cancer samples for 
elevated ETV1 protein expression by immunohistochemistry. Four 
cases exhibited focal overexpression of ETV1. One of these cases 
(HP2086) exhibited ETV1 translocation, elevated ETV1 mRNA and 
protein, and normal COP1 expression (Fig. 4e and data not shown). 
The remaining three cases lacked COP1 protein in areas staining 
strongly for ETV1 (Supplementary Fig. 7b-g) or c-JUN (Supplemen- 
tary Fig. 8). FISH was successful for two of these three samples and there 
was loss of one copy of COP1. No ETV1 or ERG translocations were 
detected (Supplementary Fig. 7h and data not shown). This scenario is 
probably analogous to what is observed with PTEN in prostate cancer: 
one PTEN allele is lost and the other is silenced”’. The mechanism(s) 
inactivating the remaining COP1 allele is unknown. Regardless, these 


Figure 3 | COP1 deficiency in prostatic epithelium causes hyperplasia, early 
MPIN, and enhances the effects of PTEN loss. a–c, Haematoxylin and eosin-stained 
ventral Cop1+/+ Pb-Cre+ and Cop1fl/fl Pb-Cre+ prostates aged 
52 weeks. Areas within dashed and solid boxes in a are magnified in b and 
c, respectively. Arrows indicate nuclear atypia. Scale bars: 100 µm (a), 25 µm 
(b, c). d–f, Cop1+/+ Ptenfl/fl Pb-Cre+ and Cop1fl/fl Ptenfl/fl Pb-Cre+ prostates 
aged 30 weeks. Boxes in d are magnified in e and f, and contain characteristic 
MPIN (left panel) and invasive adenocarcinoma (right panel). Scale bars: 50 µm 
(e), 25 µm (f). g, ETV1 and ETV4 in Cop1+/+ Ptenfl/fl Pb-Cre+ and Cop1fl/fl 
Ptenfl/fl Pb-Cre+ anterior prostates aged 30 weeks. ETV1 and ETV4 detected 
with antibodies 13G11 and 5F8, respectively. 

Figure 4 | Loss of COP1 and increased ETV1 protein expression in human 
prostate adenocarcinomas. a, COP1 DNA copy number in human prostate 
cancers. The dashed green line indicates the cutoff for copy number loss (log2 
ratio = −0.3). 5/166 tumours exhibit COP1 loss. b–d, ETV1 and COP1 protein 
expression in COP1-deficient prostate adenocarcinoma case HP48535 detected 
by immunohistochemistry (IHC). Dashed or solid boxes in b are magnified in 
c and d, respectively. Scale bars: 100 µm (b); 50 µm (c, d). e, FISH analysis of 
HP48535 with COP1 (red) and CEP1 (green) probes (upper panels). Break- 
apart FISH assays for ETV1 translocations (lower panels). A yellow fusion 
signal (arrow) is normal, whereas discrete red (arrowhead) and green 
(arrowhead) signals indicate ETV1 translocation. 

data support COP1 deficiency being a mechanism for stabilization of 
substrates such as ETV1 and c-JUN in human tumours. Even if human 
prostate cancer cells rarely lose COP1, they seem to have found an 
alternative means of evading negative regulation by COP1, namely loss 
of the COP1 binding/degron motifs from labile proto-oncogene pro- 
ducts such as ETV1 (Supplementary Fig. 9). This finding may represent 
a general paradigm for oncogene fixation in transformed cells because 
labile oncoproteins have the advantage if they lose the degron respons- 
ible for their instability. For example, c-MYC overexpression in 
Burkitt’s lymphoma is driven by translocation and mutation of residue 
T58, the latter being critical for its proteasome-mediated degradation”. 
Our work, by extension, indicates that all translocations involving labile 
oncogenes are likely to eliminate or mutate the degron(s) that normally 
confer physiological instability. 


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 

Mice. Cop1 mutant mice were generated from C2 C57BL/6 embryonic stem (ES) cells 
electroporated with targeting constructs that: (1) replaced exon 1 coding sequence 
with a lacZ reporter gene or (2) contained exon 3 (encoding amino acids 159-190) 
flanked by loxP sites (Supplementary Fig. 4). Chimeras generated with exon 1 homo- 
logous recombinants were bred to C57BL/6-Gt(ROSA)26Sor"™ CPA" mice 
(TaconicArtemis GmbH) to delete the neomycin selection cassette. Exon 3 homo- 
logous recombinants were electroporated with a Cre recombinase expression con- 
struct to remove the loxP-flanked neomycin selection cassette. All ES clones and 
mice were confirmed by Southern blotting. pCAGG.Cre-ER mice were described 
previously”. Probasin-Cre4 transgenic mice’® were backcrossed to C57BL/6N for 
at least five generations. Mice with conditional Pten alleles’ were backcrossed to 
C57BL/6N for at least 10 generations. Pregnant SD rats were from Charles River 
Laboratories. Athymic nu/nu male mice (6-8 weeks old) were from Harlan 
Sprague Dawley. The Genentech Institutional Animal Care and Use Committee 
approved all protocols. 
Genotyping. Cop1 exon 1 primers 5′-GCTACCATTACCAGTTGGTCTGGTGTC-3′, 
5′-CCAACCCCACAAGTTCAGGGAT-3′, and 5′-CTGCATCATGTTGTGTGATTGCAT-3′ 
yield 873 bp wild-type and 532 bp knockout DNA fragments. 
Cop1 exon 3 primers 5′-CATTGAAATGATAATTGCAGATTTGGTC-3′, 
5′-CACCACCCTGCCAGATCTTAAATATAGAT-3′, 5′-CAAACCTGTCACAAAATACTATTGTGCTCTC-3′ 
yield 686 bp wild-type, 753 bp floxed, and 452 bp knockout DNA fragments. 
Pten primers 5′-TCCCAGAGTTCATACCAGGA-3′, 
5′-GCAATGGCCAGTACTAGTGAAC-3′, 5′-AATCTGTGCATGAAGGGAAC-3′ 
yield ~500 bp wild-type, ~650 bp floxed, and ~300 bp knockout 
DNA fragments. Cre-specific primers 5′-GCTAAACATGCTTCATCGTCGGTC-3′ 
and 5′-CCAGACCAGGCCAGGTATCTCTG-3′ amplified a 582 bp 
fragment. 
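
The fragment sizes listed above map directly onto alleles, which can be sketched as a lookup. The band sizes are taken from the text; the function name, the matching tolerance of 25 bp and the example gel readings are assumptions made for illustration.

```python
# Expected PCR fragment sizes (bp) per assay, as listed in the Genotyping text.
# The Pten sizes are approximate in the source.
EXPECTED_BANDS_BP = {
    "Cop1 exon 1": {873: "wild-type", 532: "knockout"},
    "Cop1 exon 3": {686: "wild-type", 753: "floxed", 452: "knockout"},
    "Pten": {500: "wild-type", 650: "floxed", 300: "knockout"},
}

def call_alleles(assay, observed_bp, tolerance_bp=25):
    """Return the sorted allele names whose expected band lies within
    tolerance_bp of any observed band size (tolerance is an assumption)."""
    alleles = set()
    for size in observed_bp:
        for expected, allele in EXPECTED_BANDS_BP[assay].items():
            if abs(size - expected) <= tolerance_bp:
                alleles.add(allele)
    return sorted(alleles)

# Hypothetical gel readings, for illustration only.
print(call_alleles("Cop1 exon 3", [690, 750]))
print(call_alleles("Pten", [655]))
```

The tolerance has to stay below half the smallest gap between expected bands in an assay (67 bp for Cop1 exon 3) so that one observed band cannot match two alleles.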
Plasmids and cell transfections. Mouse ETV1 (NP_031986) from a hypothal- 
amus cDNA library was subcloned into pCMVFlag6C (Sigma). Human ETV1 
(NP_004947) was cloned into pCDNA3.1 Myc-HisA (Invitrogen). ETV4 
(NP_001073143), ETV5 (NP_004445) and ERG (NP_001129626) were cloned 
into pcDNA3 (Invitrogen), which had been modified to contain a C-terminal 
haemagglutinin (HA) fusion tag. Flag-tagged human COP1, DET1 and TRAF3 
constructs were described previously*’. Wild-type and RING mutant COP1 were 
subcloned into a pHUSH-GW (Invitrogen) inducible vector system to generate 
stable PC3 clones. HEK293T and BMK cells were transfected with FUGENE 6 
(Roche). PC3 and HCC1806 cells were transfected with FUGENE HD (Roche). 
ON-TARGETplus siRNA oligonucleotides (Dharmacon) were transfected with 
Lipofectamine2000 and Lipofectamine RNAiMax (Invitrogen) into PC3 and 
LNCaP cells, respectively. Cell lysates were prepared 48h after plasmid DNA 
transfection and 72h after oligonucleotide transfection. 
Affinity purification of complexes containing COP1. Primary baby mouse 
kidney (BMK) cells were isolated from Cop1fl/fl pCAGG.Cre-ER+ mice and 
immortalized with E1A and dominant negative p53(Δ15-301) (ref. 33). Subsequently, 
cells were grown for 72 h in 100 nM 4-hydroxytamoxifen (Sigma) to generate 
Cop1Δ/Δ cells. PCR and western blotting confirmed deletion of the Cop1 
conditional allele. COP1-deficient BMK cells were transfected with Flag-COP1, 
Flag-TRAF3, or empty vector and 24 h later were grown for 4 h in 10 µM MG-132. Cells 
were lysed in ice-cold buffer (20 mM Tris pH 7.8, 92 mM NaCl, 9 mM MgCl2, 0.1% 
Triton X-100, Complete protease inhibitor cocktail (Roche), 10 µM MG-132, 
phosphatase inhibitor cocktails 1 and 2 (Sigma), 10 U ml⁻¹ DNase I). The soluble 
lysate was immunoprecipitated with Flag M2 agarose overnight at 4 °C. Agarose 
beads were washed and bound proteins were eluted with 0.5 mg ml⁻¹ 3X-Flag 
peptide (Sigma). 
Mass spectrometry. COP1-binding proteins concentrated by ultrafiltration 
(YM3, Microcon) were reduced and alkylated for SDS-PAGE. Resolved proteins 
were stained with Coomassie blue R-250 and subjected to in-gel trypsin 
digestion (ref. 34). Liquid chromatography-tandem mass spectrometry (LC-MS/MS) analysis 
of the tryptic digests was performed on a hybrid linear ion trap Fourier transform 
ion cyclotron resonance mass spectrometer (LTQ-FT; Thermo Fisher) coupled to 
a nano-flow HPLC system (MDLC; Eksigent) in a vented column configuration (ref. 35). 
Mass spectral data were acquired using a data-dependent method comprised of 
one full MS scan (400-2,000 m/z) followed by product ion scans on the five most 
abundant ions detected. Mascot software (Matrix Science) was used to search the 
Swiss-Prot database. Results were displayed with Scaffold (Proteome Software), 
protein and peptide probability filters both set to 95%. 
Bioinformatics. The consensus COP1 binding motif [D/E](x)xxVP[D/E] was 
extracted from an alignment of Arabidopsis thaliana HY5, HYH, STO, STH; 
Mus musculus TRB3 and CRTC2; Homo sapiens c-JUN, JUNB, and JUND**?****. 
mRNA expression profiling was carried out on Affymetrix GeneChip Human 
Genome U133 Plus 2.0 Array following the manufacturer’s protocol (GEO acces- 
sion GSE27914). Expression summary values for all probe-sets were calculated 
using the RMA algorithm as implemented in the affy package from Bioconductor. 
Probe sets 234950_s_at and 221911_at were chosen to assess COP1 and ETV1, 
respectively (Fig. 1a). Breast cancer cell lines HCC1806 and HCC1937 lacked 
expression of DET1 mRNA and had very low ETV1 mRNA (data not shown). 
Statistical analyses of differentially expressed genes were performed using linear 
models and empirical Bayes moderated statistics as implemented in the limma 
package from Bioconductor. 
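
The consensus COP1 binding motif [D/E](x)xxVP[D/E] given above can be expressed as a regular expression. The sketch below assumes that "(x)" denotes one optional residue, so that two or three arbitrary residues separate the first acidic residue from VP; the function name and the test peptides are hypothetical and are not sequences from the paper.

```python
import re

# [D/E](x)xxVP[D/E]: acidic residue, two or three arbitrary residues,
# then V, P and a second acidic residue. The lookahead lets overlapping
# hits be reported.
COP1_MOTIF = re.compile(r"(?=([DE].{2,3}VP[DE]))")


def find_cop1_motifs(protein_seq):
    """Return (start, match) pairs for every motif hit; start is 0-based."""
    return [(m.start(), m.group(1))
            for m in COP1_MOTIF.finditer(protein_seq.upper())]
```

A scan of this kind only nominates candidate sites; it says nothing about whether a given match is bound by COP1 in cells.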

DNA copy number data for human COP1 in prostate cancer samples were 
extracted from two public Affymetrix SNP array data sets (GEO accession 
GSE12702, n = 20; GSE19399, n = 87, including data downloaded from 
http://www.broadinstitute.org/tumorscape/pages/portalHome.jsf), and one Agilent 
Human Genome CGH 244A data set generated by Genentech (GSE20393, 
n = 59). All raw data were processed with the Genentech internal data analysis 
pipeline. For the Affymetrix SNP array data, array intensity signal CEL files were 
first processed by dChip using the PM/MM difference model and invariant set 
normalization, and normalized with data for normal samples from the Affymetrix 
website (http://www.affymetrix.com). Agilent CGH array data were first processed 
by Feature Extraction Software from Agilent. All processed copy numbers were 
then centred to a median of 2 and segmented. Copy number values for specific 
genes were calculated as the mean copy number value for the probe sets bounding 
the gene location and all intervening probe sets using the segmented data. 
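
The last two steps of the copy number procedure (centring to a median of 2, then averaging segmented values over the probe sets that bound a gene) can be sketched as follows. This is an illustration under stated assumptions, not the Genentech pipeline: centring is done here by an additive shift, the segmentation step itself is not shown, and all function and parameter names are hypothetical.

```python
from statistics import mean, median


def centre_to_median_two(copy_numbers):
    """Shift values so that the sample median equals 2 (additive shift assumed)."""
    shift = 2.0 - median(copy_numbers)
    return [cn + shift for cn in copy_numbers]


def gene_copy_number(probes, gene_start, gene_end):
    """Mean segmented copy number over the probes bounding a gene and all
    probes in between.

    probes: list of (position, segmented_copy_number) tuples for one
    chromosome, sorted by position.
    """
    positions = [p for p, _ in probes]
    # Last probe at or before the gene start (falls back to the first probe).
    left = max((i for i, p in enumerate(positions) if p <= gene_start), default=0)
    # First probe at or after the gene end (falls back to the last probe).
    right = min((i for i, p in enumerate(positions) if p >= gene_end),
                default=len(probes) - 1)
    return mean(cn for _, cn in probes[left:right + 1])
```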
siRNAs. COP1 siRNA1 5'-CUACAAGGAUGUCUCGUAU-3'; 
COP1 siRNA2 5'-GCUAAUGUGUGCUGUGUUA-3'; 
COP1 siRNA3 5'-GAAUUGGUAUGAAGGGUUA-3'; 
COP1 siRNA4 5'-CAUAAGAACCUGUUAGCUA-3'; 
ETV1 siRNA1 5'-GAACAGCCCUUUAAAUUCA-3'; 
ETV1 siRNA2 5'-CAACGAAGGCUACGUGUAU-3'; 
ETV1 siRNA3 5'-UCUCCAAACUCAACUCAUA-3'; 
ETV1 siRNA4 5'-GAGAAAUUGUAACGAGAAA-3'; 
ETV4 siRNA1 5'-GGGCAGAGCAACGGAAUUU-3'; 
ETV4 siRNA2 5'-GAAUGGAGUUCAAGCUCAU-3'; 
ETV4 siRNA3 5'-GGACUUCGCCUACGACUCA-3'; 
ETV4 siRNA4 5'-GAUGAAAGCCGGAUACUUG-3'; 
ETV5 siRNA1 5'-CCGAAGGCUUUGCUUACUA-3'; 
ETV5 siRNA2 5'-CGGCAAAUGUCAGAACCUA-3'; 
ETV5 siRNA3 5'-GAGAUAAUCGCCCCAGUUA-3'; 
ETV5 siRNA4 5'-GGAAAUCUCGAUCUGAGGA-3'; 
DET1 siRNA1 5'-GUAGUAACACUGCGAGUCA-3'; 
DET1 siRNA3 5'-CAAGUACACUAGUGAGGAU-3'; 
c-JUN siRNA6 5'-GAACAGGUGGCACAGCUUA-3'; 
c-JUN siRNA7 5'-GAAACGACCUUCUAUGACG-3'. 
Non-targeting siRNAs were from Dharmacon (catalogue D-001810-01-20; 
D-001810-02-20; D-001810-10-05; D-001810-01-05). 
Immunoprecipitations and western blotting. Cells were lysed in 20 mM HEPES 
pH 7.2, 2 mM EGTA, 5 mM EDTA, 30 mM NaF, 60 mM β-glycerophosphate, 
20 mM sodium pyrophosphate, 1 mM Na3VO4, 1% Triton X-100. Soluble lysate 
was immunoprecipitated with Flag M2 agarose or with the indicated antibodies 
coupled to Protein A/G agarose beads (Pierce Biotechnology). Beads were washed 
extensively in lysis buffer containing 0.5 M NaCl and then once more in straight 
lysis buffer before elution in LDS-sample buffer (Invitrogen) containing 1% 
2-mercaptoethanol. To detect ubiquitinated MYC-ETV1, soluble lysate was sup- 
plemented with 1% SDS and heated at 95°C for 10 min. Denatured lysate was 
diluted 20-fold in lysis buffer and then immunoprecipitated with 9E10 MYC 
agarose (Clontech). 

Antibodies used for immunoblotting recognized β-actin (Novus Biologicals, 
NB600-501), c-JUN (Epitomics, 1254-1), COP1 (Genentech, 28A4), DET1 
(Genentech, 3G5), ETV1 (Abcam, Ab36788), ETV1 (Genentech rat monoclonal 
13G11), ETV4 (Genentech rabbit polyclonal Y771, which recognizes amino acids 
2-199 of mouse ETV4), ETV4 (Genentech rat monoclonal 5F8), Flag (Sigma, M2), 
GAPDH (Cell Signal, 14C10), GFP (Invitrogen, A11122), HA (Sigma HA-7), K48- 
or K63-linked polyubiquitin (Genentech Apu2.07 and Apu3.A8; ref. 39), and MYC 
(Genetex, GTX21261). Unless indicated, ETV1 was blotted with Ab36788 
throughout. 

Antibodies used for immunoprecipitation detected COP1 (28A4) and ETV1 
(Genentech rabbit polyclonal antibodies Y713 and Y714, which recognize amino 
acids 268-477 of mouse ETV1). Quantification of western blot signals was per- 
formed on a Typhoon scanner (GE Healthcare) following chemiluminescence 
detection with ECLplus (GE Healthcare). 

Cell invasion. PC3 clonal cell lines stably expressing doxycycline-inducible COP1 
variants were grown overnight in medium lacking fetal bovine serum (FBS) and in 
the presence of 0.03 µg ml⁻¹ of doxycycline or PBS vehicle control. In the morning, 
cells were trypsinized, resuspended in the same medium, counted, and then 
assayed for their invasiveness. Cells (10°) were seeded in quadruplicate in 24-well 
transwell plates (8 µm pores, Fluoroblok PET membranes (BD Falcon)) that were 
previously coated with rat tail collagen type 1. Medium containing 10% FBS was 
added to lower compartment. After 23-24h live cells were stained with calcein 
(1 µg ml⁻¹ for 1 h), imaged and quantified on an ImageXpress microscope device 
(Molecular Devices). In Fig. 2h-j, cells were cotransfected with GFP expressing 
vector and the desired expression vectors 24 h before serum deprivation. 
GFP-expressing cells that migrated into the lower chamber were imaged. 
Quantitative reverse transcriptase polymerase chain reaction (qRT-PCR). 
Total cellular RNA was prepared with a RNeasy Plus kit (Qiagen) and subject 
to DNase I digestion. RT-PCR reactions were performed in 384-well plates, with 
10 to 40ng of RNA per reaction on a 7900HT Fast Real-Time PCR System 
(Applied Biosystems). Taqman gene expression assays (Applied Biosystems): 
COP1, Hs00375437_m1; DET1, Hs00894490_m1; ETV1, Hs00231877_m1; 
ETV4, Hs00385910_m1; ETV5, Hs00231790_m1; MMP1, Hs00899658_m1; 
MMP7, Hs01042796_m1; RPLP0, 4326314E. Control reactions lacked reverse 
transcriptase and had Ct values at least 3 units higher than reactions performed 
with reverse transcriptase. 
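
The acceptance criterion for the no-reverse-transcriptase controls stated above (a Ct at least 3 units higher than the matching reaction with reverse transcriptase) can be written as a one-line check. The function name and the treatment of an undetected control (passed as None) are assumptions made for illustration.

```python
def no_rt_control_passes(ct_with_rt, ct_without_rt, min_gap=3.0):
    """True if the no-RT control Ct is at least min_gap cycles above the +RT Ct.

    A no-RT reaction with no detectable amplification (None) is treated as
    passing; this convention is an assumption, not stated in the paper.
    """
    if ct_without_rt is None:
        return True
    return (ct_without_rt - ct_with_rt) >= min_gap
```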

MMP1 and MMP7 protein quantification. Forty-eight hours following 
transfection, LNCaP cells were grown overnight in serum-free medium. Total MMP1 and 
MMP7 proteins in conditioned medium were quantified by ELISA with SensoLyte 
MMP1 (AnaSpec) and Quantikine MMP7 (R&D Systems). 

Prostate regeneration model. Rat UGM stromal cell isolation’’ and the prostate 
regeneration assay were described previously'*”°. Prostates from 8- to 10-week-old 
mice were dissociated and live cells enriched with a Dead Cell Removal Kit 
(Miltenyi Biotec). Prostate cells (100,000 cells per graft) were mixed with UGM 
stromal cells (250,000 cells per graft) in 3 mg ml⁻¹ collagen type I (20 µl per graft), 
incubated at 37 °C for 1 h to allow collagen gelation, and overlaid with prostate 
culture medium (DMEM supplemented with 10% FBS, 2 mM glutamine, 
10 µg ml⁻¹ insulin, 5.5 µg ml⁻¹ transferrin, 6.7 ng ml⁻¹ selenium, 1 nM testosterone 
(Innovative Research of America), 100 U ml⁻¹ penicillin and 100 mg ml⁻¹ 
streptomycin). Gels were incubated overnight at 37 °C with 10” plaque forming units 
per ml of Ad5-CMV-Cre-GFP virus (Baylor College of Medicine), washed in 
culture medium, and grafted under the renal capsule of 6-8 weeks old athymic 
nu/nu mice together with a subcutaneous 90-day slow-release testosterone pellet 
(12.5 mg per pellet per mouse; Innovative Research of America). Grafts were 
harvested 12 weeks after implantation. 

Immunohistochemistry. Immunohistochemical analyses of regenerated prostate 
structures were performed as described’* with antibodies to CK18 (Abcam, C-04), 
Ki67 (BD Biosciences, clone B56), CK14 (Covance, AF64), β1-integrin 
(Chemicon, clone MB1.2), and probasin (Santa Cruz Biotechnology, M-18). 
Percentages of positive cells were calculated by assessing at least 1,000 cells per 
genotype stained for CK18, 900 cells for Ki67, and 300 cells for CK14. 

Formalin-fixed paraffin-embedded mouse prostates were cut at 4 µm, 
pretreated with Target Antigen Retrieval buffer (DAKO) followed by KPL peroxidase 
blocking solution (Kirkegaard and Perry Laboratories) and avidin/biotin blocking 
kit (Vector Labs). TNB blocking buffer (Perkin Elmer) or 10% goat serum/3% 
BSA/PBS was used before incubation with either 1 µg ml⁻¹ rat ETV1 antibody 
(Genentech, Clone 1H2), 0.125 ng ml⁻¹ hamster COP1 antibody (Genentech, 
Clone 1D10), 0.3 µg ml⁻¹ rabbit JUN antibody (Epitomics, Clone E254) or at 
1:200 with rabbit Ki67 antibody (Thermo Scientific, Clone SP6) for 60 min at 
room temperature. Species-appropriate biotinylated secondary antibody (Vector 
Labs), followed by a streptavidin-HRP reagent from TSA kit (for ETV1 and COP1 
IHC; Perkin Elmer) or ABC Elite HRP Reagents (for c-JUN and Ki67 IHC; Vector 
Labs) was applied. Biotinylated TSA amplification reagents (Perkin Elmer) were 
used to visualize ETV1 and COP1 immunostaining. Sections were treated with 
metal enhanced DAB colorimetric peroxidase substrate (Thermo Scientific), then 
Myer’s haematoxylin (Rowley Biochemical Institute) counterstain. 

Rabbit ETV4 antibody (Lifespan Biosciences) staining of mouse prostate was 
performed on the Ventana Discovery XT Platform at 5 µg ml⁻¹ with no antigen 
retrieval using the anti-murine-OMNIMAP-HRP Kit and Ventana DAB colori- 
metric reagents. Sections were counterstained with Ventana Hematoxylin II 
reagent. Experiments to validate the 1H2 rat anti-ETV1, Ab36788 rabbit anti- 
ETV1 and the Lifespan anti-ETV4 antibodies are presented (Supplementary 
Fig. 1e, f and Supplementary Fig. 10). 


In situ hybridization (ISH). Radioactive in situ hybridization was performed as 
previously described (ref. 41). Probes were prepared by PCR with the following primers: 
human COP1 (5'-GGGCTCATCAACTCCTACGA-3'; 5'-GAGAACTGCCACTGAAACCTG-3'); 
human ETV1 (5'-GAATCTTTGTTTTATTTCTGTTGT-3'; 
5'-CAGAGTCCAAAATTGTGCCCCTC-3'). The slides were exposed for 
5 weeks, developed, and counterstained with haematoxylin and eosin. 
Fluorescence in situ hybridization (FISH). A bacterial artificial chromosome 
(BAC) contig comprising three overlapping clones, RP11-102E20, CTD-3127J24, 
and RP11-415M4 was used as a COP1 probe. A CEP1 probe (Vysis/Abbott 
Laboratories) was also used. FISH probes for identifying ERG and ETV1 translo- 
cations were provided by A. M. Chinnaiyan and used as described*’. BAC clone 
DNA was extracted by standard methods* and directly labelled with Spectrum 
Orange by nick translation (Vysis/Abbott Laboratories). FISH to normal human 
metaphases (Abbott Laboratories) confirmed the genomic location of the BAC 
clones. FISH on cytogenetic preparations and formalin-fixed paraffin-embedded 
tissue was performed as described**. COP1 copy number was evaluated by count- 
ing spots in a range from 50 to 100 non-overlapped, intact interphase nuclei per 
tumour tissue core. 4’,6-diamidino-2-phenylindole, dihydrochloride staining of 
nuclei with reference to the corresponding haematoxylin- and eosin-stained tissue 
identified the areas of adenocarcinoma. Based on hybridization in control normal 
cells (data not shown), hemizygous deletion of COP1 was defined as >30% 
(mean + 3 s.d. in non-neoplastic cells) of tumour nuclei containing one COP1 
locus signal and by the presence of CEP1 signals. 
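
The scoring rule for hemizygous deletion described above can be sketched in Python. The 30% cutoff and the mean + 3 s.d. derivation come from the text; the function names, the use of the sample standard deviation and the boolean handling of the CEP1 signal are assumptions made for illustration.

```python
from statistics import mean, stdev


def deletion_cutoff(control_fractions):
    """Mean + 3 s.d. of the fraction of non-neoplastic nuclei with one signal.

    Uses the sample standard deviation; the paper does not say which
    estimator was used.
    """
    return mean(control_fractions) + 3 * stdev(control_fractions)


def is_hemizygous_deleted(one_signal_nuclei, nuclei_counted,
                          cutoff=0.30, cep1_present=True):
    """Call hemizygous COP1 deletion for one tumour tissue core.

    one_signal_nuclei: nuclei showing a single COP1 locus signal
    nuclei_counted: intact, non-overlapped interphase nuclei scored
    (50 to 100 per core in the protocol above)
    """
    fraction = one_signal_nuclei / nuclei_counted
    return cep1_present and fraction > cutoff
```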


30. Hayashi, S. & McMahon, A. P. Efficient recombination in diverse tissues by a 
tamoxifen-inducible form of Cre: a tool for temporally regulated gene activation/ 
inactivation in the mouse. Dev. Biol. 244, 305-318 (2002). 

31. Wang, S. et al. Prostate-specific deletion of the murine Pten tumor suppressor 
gene leads to metastatic prostate cancer. Cancer Cell 4, 209-221 (2003). 

32. Kayagaki, N. et al. DUBA: a deubiquitinase that regulates type I interferon 
production. Science 318, 1628-1632 (2007). 

33. Shaulian, E., Zauberman, A., Ginsberg, D. & Oren, M. Identification of a minimal 
transforming domain of p53: negative dominance through abrogation of 
sequence-specific DNA binding. Mol. Cell. Biol. 12, 5581-5592 (1992). 

34. Shevchenko, A., Tomas, H., Havlis, J., Olsen, J. V. & Mann, M. In-gel digestion for 
mass spectrometric characterization of proteins and proteomes. Nature Protocols 
1, 2856-2860 (2007). 

35. Yi, E. C., Lee, H., Aebersold, R. & Goodlett, D. R. A microcapillary trap cartridge- 
microcapillary high-performance liquid chromatography electrospray ionization 
emitter device capable of peptide tandem mass spectrometry at the attomole level 
on an ion trap mass spectrometer with automated routine operation. Rapid 
Commun. Mass Spectrom. 17, 2093-2098 (2003). 

36. Holm, M., Hardtke, C. S., Gaudet, R. & Deng, X. W. Identification of a structural motif 
that confers specific interaction with the WD40 repeat domain of Arabidopsis 
COP1. EMBO J. 20, 118-127 (2001). 

37. Ang, L. H. et al. Molecular interaction between COP1 and HY5 defines a regulatory 
switch for light control of Arabidopsis development. Mol. Cell 1, 213-222 (1998). 

38. Dentin, R. et al. Insulin modulates gluconeogenesis by inhibition of the coactivator 
TORC2. Nature 449, 366-369 (2007). 

39. Newton, K. et al. Ubiquitin chain editing revealed by polyubiquitin linkage-specific 
antibodies. Cell 134, 668-678 (2008). 

40. Cunha, G. R. & Lung, B. The possible influence of temporal factors in androgenic 
responsiveness of urogenital tissue recombinants from wild-type and androgen- 
insensitive (Tfm) mice. J. Exp. Zool. 205, 181-193 (1978). 

41. Jubb, A. M., Pham, T. Q., Frantz, G. D., Peale, F. V. Jr & Hillan, K. J. Quantitative in situ 
hybridization of tissue microarrays. Methods Mol. Biol. 326, 255-264 (2006). 

42. Mehra, R. et al. Comprehensive assessment of TMPRSS2 and ETS family gene 
aberrations in clinically localized prostate cancer. Mod. Pathol. 20, 538-544 
(2007). 

43. O’Brien, C. et al. Functional genomics identifies ABCC3 as a mediator of taxane 
resistance in HER2-amplified breast cancer. Cancer Res. 68, 5380-5389 (2008). 

44. Pandita, A., Aldape, K. D., Zadeh, G., Guha, A. & James, C. D. Contrasting in vivo and 
in vitro fates of glioblastoma cell subpopulations with amplified EGFR. Genes 
Chromosom. Cancer 39, 29-36 (2004). 




LETTER 


doi:10.1038/nature10006 


Reprogramming transcription by distinct classes of 
enhancers functionally defined by eRNA 


Dong Wang, Ivan Garcia-Bassets, Chris Benner, Wenbo Li, Xue Su, Yiming Zhou, Jinsong Qiu, Wen Liu, 
Minna U. Kaikkonen, Kenneth A. Ohgi, Christopher K. Glass, Michael G. Rosenfeld & Xiang-Dong Fu 


Mammalian genomes are populated with thousands of transcrip- 
tional enhancers that orchestrate cell-type-specific gene expression 
programs’ *, but how those enhancers are exploited to institute 
alternative, signal-dependent transcriptional responses remains 
poorly understood. Here we present evidence that cell-lineage- 
specific factors, such as FoxA1, can simultaneously facilitate and 
restrict key regulated transcription factors, exemplified by the 
androgen receptor (AR), to act on structurally and functionally 
distinct classes of enhancer. Consequently, FoxA1 downregula- 
tion, an unfavourable prognostic sign in certain advanced prostate 
tumours, triggers dramatic reprogramming of the hormonal res- 
ponse by causing a massive switch in AR binding to a distinct 
cohort of pre-established enhancers. These enhancers are func- 
tional, as evidenced by the production of enhancer-templated 
non-coding RNA (eRNA’°) based on global nuclear run-on sequen- 
cing (GRO-seq) analysis®, with a unique class apparently requiring 
no nucleosome remodelling to induce specific enhancer-promoter 
looping and gene activation. GRO-seq data also suggest that 
liganded AR induces both transcription initiation and elongation. 
Together, these findings reveal a large repository of active enhancers 
that can be dynamically tuned to elicit alternative gene expression 
programs, which may underlie many sequential gene expression 
events in development, cell differentiation and disease progression. 

The wide diversity of mammalian cells is determined by a large 
repertoire of constitutive and inducible genes, which are regulated by 
general and cell-type-specific transcription factors and cofactors 
through regulatory genomic elements”*. Recent studies reveal that gene 
promoters are marked by tri-methylated H3K4 (H3K4me3) and distal 
regulatory elements are often associated with mono-methylated H3K4 
(H3K4me1)'”. Because these H3K4me1-positive, H3K4me3-negative 
regions exhibit striking cell-type specificity'”, we used this signature to 
characterize potential enhancers in prostatic LNCaP cells, in which one 
of the key regulatory transcriptional programs is mediated by the AR. We 
identified by chromatin immunoprecipitation (ChIP)-sequencing 
14,283 H3K4me3-marked and 51,544 H3K4me1-marked loci in 
androgen (5α-dihydrotestosterone (DHT))-treated LNCaP cells, 
among which 43,565 loci are uniquely marked by H3K4me1, largely 
localized distal to annotated transcriptional start sites (TSSs) (94%), 
and associated with other marks linked to enhancer activities (Fig. 1a). 

De novo DNA motif analysis revealed several highly enriched 
motifs, particularly the forkhead motif (Fig. 1b). Using a specific 
antibody against FoxA1, a major FOX family member expressed in 
LNCaP cells and normal prostate gland’ (Supplementary Fig. 1), we 
identified 33,426 FoxA1-bound sites, which extensively overlap with 
distal H3K4me1-marked regions (Fig. 1c and Supplementary Fig. 2a; 
see the KLK3 enhancer’ in Supplementary Fig. 2b). RNA profiling 
supports the functional relevance of these FoxA1/H3K4me1 loci, 
as genes responsive to FOXA1 short interfering RNA (siRNA) are 
located more proximally to FoxA1/H3K4me1-marked loci than 
non-responsive genes (Fig. 1d and Supplementary Fig. 3). 

FoxA1 has been characterized as a ‘pioneer’ factor to facilitate DNA 
binding by other sequence-specific transcription factors”’* '* and 
‘translate H3K4me1/me2 into AR-mediated gene expression’. Comparing 
the profile of H3K4me1 and H3K27ac before and after FOXA1 
knockdown, we detected three classes of FoxA1-binding sites based on the 
H3K4me1 signal exhibiting reduced (~22%), relatively unaffected 
(~74%) or even increased (~3.4%) levels over candidate enhancers 
(Fig. 1e-g and Supplementary Fig. 4). RNA profiling analysis agrees with 
the functional significance of these selective FoxA1 effects, revealing 
more downregulated genes in the first class, roughly equal numbers of 
up- or downregulated genes in the second and more upregulated genes 
in the third (Fig. 1h), suggesting a contribution of FoxA1 to ‘writing’ and 
‘reading’ the ‘histone code’ on different enhancer cohorts, in line with its 
critical function in prostate gland development’®". 
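
The three classes of FoxA1-binding sites described above can be illustrated with a simple fold-change rule. The 1.5-fold threshold is taken from the legend of Fig. 1e-g; treating everything between the two thresholds as "unaffected" is a simplification of the legend's "no significant change", and the function name and inputs are hypothetical.

```python
def classify_h3k4me1_response(signal_control, signal_knockdown, fold=1.5):
    """Classify a FoxA1/H3K4me1 locus by its H3K4me1 change after FOXA1 knockdown.

    Returns "reduced" for a decrease greater than `fold`, "increased" for an
    increase greater than `fold`, and "unaffected" otherwise.
    """
    ratio = signal_knockdown / signal_control
    if ratio < 1.0 / fold:
        return "reduced"
    if ratio > fold:
        return "increased"
    return "unaffected"
```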

The rationale for our experimental strategy to use RNA interference 
(RNAi) to study the FoxA1-regulated enhancer network is the association 
of decreased FOXA1 expression with castration-resistant, 
poor-prognosis prostate tumours (Supplementary Fig. 5). In LNCaP cells, 
FOXA1 RNAi enhanced cell entry into S phase with reduced hormone 
(Fig. 2a). To understand the mechanistic basis for elevated hormone 
responsiveness, we mapped AR-binding sites, identifying 3,115 
high-confidence loci, approximately 65% of which coincide with H3K4me1. 
De novo motif analysis revealed highly enriched elements for both 
AR and FoxA1, including a composite motif consisting of a FOX motif 
and AR regulatory element (ARE) half site, suggesting ternary complex 
formation on these sites (Fig. 2b). Indeed, 1,684 AR-bound loci (54% of 
total) are co-occupied by FoxA1 in DHT-treated LNCaP cells and 
FoxA1 appears to bind to most of these sites (~70%) before hormone 
treatment (Supplementary Fig. 6). 

The conundrum is that, although FoxA1 is known to facilitate AR 
binding on several DHT-responsive genes’, FOXA1 RNAi actually 
markedly elevated, rather than diminished, the DHT response 
(Fig. 2a). We found that approximately 60% of the original AR binding 
events were ‘expectedly’ lost in response to FOXA1 RNAi, which we 
refer to as the ‘lost’ AR program (Fig. 2c, d). We refer to the remaining 
approximately 40% of AR binding events as the ‘conserved’ AR program, 
which often exhibited enhanced AR binding. Strikingly, we detected a 
massive gain of 10,869 new AR binding loci, referred to as the ‘gained’ 
AR program (Fig. 2c, d). We extensively validated each of these AR 
programs by conventional ChIP-quantitative PCR (qPCR) (Fig. 2e). 
This induced AR reprogramming appears to be qualitatively and 
quantitatively distinct from reported AR re-targeting on androgen- 
resistant LNCaP-abl cells compared with parental LNCaP cells’? and 
is in sharp contrast to FoxA1-dependent genomic targeting of the 
oestrogen receptor-α (ER-α) in breast cancer MCF7 cells'®. In concert 
with such massive AR reprogramming, we observed corresponding 
Figure 1 | FoxA1 contributes to the enhancer code in prostate cancer cells. 
a, Distribution of histone marks within ±2-kb windows around distinct 
genomic regions (n = 43,565) marked by H3K4me1, but not H3K4me3, in 
androgen (DHT)-stimulated LNCaP cells. The ChIP-seq data sets for 
H3K4me1, H3K4me2, H3K4me3, H3K27ac, H4K5ac and p300 were each 
aligned with respect to the centre of the H3K4me1 signal and sorted by the 
length of H3K4me1-marked regions. b, Top-enriched DNA motifs with 
significant P values and prospective families of DNA binding transcription 
factors identified by de novo motif analysis of non-promoter regions marked by 
H3K4me1. c, Percentage of H3K4me1-marked regions that show FoxA1 
binding events (top panel) and percentage of FoxA1-binding sites that are 
marked by H3K4me1 (bottom panel). Note that H3K4me1-marked regions 
tend to be broad, but FoxA1-binding sites are discrete; as a result, many 
H3K4me1-positive regions may contain more than one FoxA1-binding site. 
d, Genomic distance from FoxA1/H3K4me1-positive loci to the nearest TSS of 
genes in response to FOXA1 knockdown. Outliers were omitted from box plots. 
P values indicate the significance in pair-wise comparisons. e-g, Three classes 
of FoxA1/H3K4me1-positive loci according to the response in levels of 
H3K4me1 to FOXA1 knockdown: greater than 1.5-fold decrease (e), no 
significant change (f) and greater than 1.5-fold increase (g). h, Ratio (log2) of 
up- and downregulated genes in each H3K4me1-responsive category in 
e-g. CTL, control. 


changes in gene expression in each of three AR programs (Fig. 2f, g and 
Supplementary Fig. 7). The newly induced AR expression program is 
also linked to AR binding events (Fig. 2h), suggesting a direct gain-of- 
function on DHT-responsive genes, as illustrated on SOX9 and other 
genes (Supplementary Fig. 8), which have been previously documen- 
ted to play critical roles in cancer progression’*”*. Because we also 
observed an approximate threefold elevation of AR expression in 
FOXA1 RNAi-treated cells (Supplementary Fig. 9a), we tested the 
possibility that increased AR expression might trigger these effects. 
We found that AR overexpression alone was insufficient to induce 
AR reprogramming (Supplementary Fig. 9b). 
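
The division of AR binding events into 'lost', 'conserved' and 'gained' programs described above amounts to comparing two peak sets. The sketch below assumes peaks are given as (chromosome, start, end) tuples with half-open coordinates and that any overlap counts as the same binding event; the paper does not state its overlap criterion, and the function names are hypothetical.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_ar_programs(control_peaks, knockdown_peaks):
    """Split AR peaks into lost, conserved and gained programs.

    lost: present in control cells only
    conserved: control peaks that still overlap a peak after FOXA1 knockdown
    gained: present after FOXA1 knockdown only
    """
    lost, conserved = [], []
    for peak in control_peaks:
        if any(overlaps(peak, other) for other in knockdown_peaks):
            conserved.append(peak)
        else:
            lost.append(peak)
    gained = [peak for peak in knockdown_peaks
              if not any(overlaps(peak, other) for other in control_peaks)]
    return {"lost": lost, "conserved": conserved, "gained": gained}
```

This quadratic comparison is adequate for a toy example; for tens of thousands of peaks a sorted sweep or an interval tree would be the usual choice.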

To explore the mechanism for AR reprogramming, we determined 
FoxA1 binding on different AR programs. We found that the gained 
AR program is largely devoid of FoxA1, whereas FoxA1 is present in 
more than half of the lost and conserved AR programs (Supplementary 
Fig. 10). This raises the possibility that FoxA1 may facilitate AR bind- 
ing to its original binding program, but trans-repress AR from binding 
to other genomic regions that lack FoxA1-binding sites in the gained 
program, a strategy frequently used by other transcription activators”'. 
Indeed, as previously reported’, FoxA1 overexpression squelched 
ARE-driven transcription in transfected HEK293 cells 
(Supplementary Fig. 11), which is consistent with the ability of AR to interact with 
FoxA1 directly”. This mechanism appears to be exploited during 
tumour progression because an AR mutant identified in advanced 
prostate tumours, which lacks part of the hinge domain important for 
interactions with FoxA1, lost its ability to interact with FoxA1 and became 
resistant to FoxA1-mediated trans-repression (Supplementary Fig. 
11b, c). Furthermore, our functional analysis indicates that the missing 
AR ligand-binding domain also contributes to AR:FoxA1 interactions 
(Supplementary Fig. 12). Interestingly, similar AR truncations have 
also been reported to result from alternative splicing, gene rearrange- 
ment and/or calpain-mediated cleavage (Supplementary Fig. 13). 
Based on these findings, we propose that FoxA1 regulates AR genomic 
targeting by simultaneously anchoring AR to cognate loci and restrict- 
ing AR from other ARE-containing loci in the human genome. 

To understand how reprogrammed AR binding is translated to 
altered hormonal response, we took advantage of the recently estab- 
lished GRO-seq® to detect the functional relationship between AR 
binding and hormone-induced gene expression. This powerful 
genome-wide interrogation of ongoing transcription detected a broad 
scope of nascent RNAs. We uncovered 28,318 transcripts with 15,656 
annotated and 12,662 unannotated transcripts, among which 450 cod- 
ing and 347 unannotated transcripts were induced more than 1.5-fold 
after just 1 h of DHT treatment (Supplementary Fig. 14). The TSSs 
of GRO-seq-defined transcripts are typically marked by H3K4me3 and 
H3K27ac (Supplementary Fig. 15a, b). Importantly, GRO-seq also 
detected non-coding RNAs from a subset of H3K4me1-positive, 
H3K4me3-negative regions (Supplementary Fig. 15c). As illustrated 
on the enhancer of the KLK3 transcription unit (Fig. 3a), these eRNAs 
are largely symmetrical and bidirectional (see additional examples on 
other well-known hormone-regulated genes, such as PMEPA1 and 
KLK2 in Supplementary Fig. 16). Interestingly, we often detected a 
large amount of nascent RNA before DHT treatment, particularly near 
their TSSs (for example, KLK3); DHT not only enhanced the expres- 
sion of these nascent RNAs, but also allowed the extension of tran- 
scription towards the end of the gene (Fig. 3a and Supplementary Fig. 
16). We estimated that approximately 79% of the transcription units 
induced by liganded AR are regulated at the level of transcriptional 
initiation, whereas approximately 21% appear to be primarily regu- 
lated at the level of elongation (Supplementary Fig. 17). 

The ability to detect regulated eRNA expression allowed us to ana- 
lyse different AR programs during transcriptional reprogramming. In 
the presence of FoxA1, DHT enhanced eRNA expression from 
AR-bound enhancers in both the lost and conserved AR programs. In 
contrast, a basal level of eRNAs was detectable on the gained program, 
but was independent of the hormone treatment, indicating that these 
are pre-established enhancers (Fig. 3b). In response to FOXA1 RNAi, 
Figure 3 | Transcriptional response on individual enhancer programs to 
FOXA1 downregulation. a, Display of nascent RNA detected by GRO-seq on 
the KLK3 locus. The DHT-induced AR binding is shown at bottom as a 
reference. b, c, Induction of eRNA by DHT (b) or FOXA1 knockdown in DHT- 
treated LNCaP cells (c). The eRNA levels under different conditions (indicated 
at bottom) are separately displayed on three AR-binding programs. d, e, Effects 
of FoxA1 on binding of p300 (d) and Med12 (e) in each AR program in 
DHT-treated LNCaP cells. f, g, Long-distance interaction between gene promoter and 
AR-bound site was determined by the 3C assay on two representative gene loci 
selected from the conserved and gained AR programs. Negative controls at 
shorter distances and a positive control with the corresponding BAC in the 
region are included in each case. 
the expression of eRNAs was diminished from the lost program, but 
modestly or dramatically enhanced from the conserved and gained 
programs, respectively (Fig. 3c). The DHT-induced nascent tran- 
scripts (detected by GRO-seq) and steady-state RNAs (detected by 
microarrays) best predict direct target genes by liganded AR, as they 
show the shortest distance (<50 kilobases (kb)) to nearby AR-binding 
sites compared with genes identified by either criterion alone (Sup- 
plementary Fig. 18), indicating that AR-activated enhancers marked 
by increased eRNA are responsible for activation of nearby coding 
transcription units. 

In concert with differential eRNA expression, we also observed
corresponding changes in levels of another mark in the final step of
enhancer activation, specifically p300, on both conserved and gained
AR programs (Fig. 3d). Interestingly, enhancers in the lost AR pro-
gram continued to exhibit significant p300 binding, even after AR
binding and eRNA expression were diminished in FOXA1 knockdown
cells (Fig. 3c, d). The transcription mediator Med12 has recently been
suggested to mediate enhancer-promoter looping. We tested Med12
binding on individual AR programs, finding that it exhibited an ident-
ical binding pattern to p300 (Fig. 3e). Enhanced Med12 binding on the
conserved and gained programs after FOXA1 knockdown suggests
elevated or newly activated enhancer-promoter interactions. This
was demonstrated by the 3C assay on two representative genes where
FOXA1 knockdown either enhanced (on the FASN locus from the
conserved AR program) or created new (on the NDRG1 locus in the
gained AR program) long-range interactions between AR-bound
enhancers and specific gene promoters in DHT-treated cells (Fig. 3f,
g and Supplementary Fig. 19). These data strongly suggest that the
induction of eRNAs, rather than binding of either p300 or Med12, is
the most precise mark of the final, functional looping between an
activated enhancer and its regulated gene promoter.

Addressing the structural basis for different functional classes of AR
enhancers, we note the distinct profiles of H3K4me1 and H3K27ac
on the lost, conserved and gained AR programs, and that FOXA1 RNAi had
little effect on these profiles (Fig. 4a, b and Supplementary Fig. 20). The
histone marks H3K4me1 and H3K27ac around the lost and conserved
AR programs exhibit a bimodal distribution, which is particularly pro-
nounced on the lost program (Fig. 4a, bottom panel). The DNA-
binding sites in the lost AR program are actually significantly less
enriched in canonical AREs, which may render AR binding on these
sites particularly dependent on FoxA1, whereas both the conserved and
gained AR programs are associated with nearly perfect palindromic,
canonical AREs (Supplementary Fig. 21), explaining why AR is able to
target those sites in a FoxA1-independent manner. Strikingly, the
gained AR-binding sites are coincident with sharp H3K4me1 and
H3K27ac peaks (Fig. 4a, b, middle panels), suggesting a distinct nucleo-
some architecture underlying the gained AR program.

A recent study has suggested that AR binding leads to dynamic
dismissal of a central, H2A.Z-containing nucleosome, which is replaced
by two flanking H3K4me2-marked nucleosomes. We found that the
lost AR program was largely devoid of a ‘central’ nucleosome even
before AR binding (Fig. 4c, bottom panel). The conserved AR program
exhibited a DHT-induced switch from the central H3K4me2-marked
nucleosome to two flanking H3K4me2-marked nucleosomes, which

Figure 4 | Distinct classes of AR enhancers in the human genome.
a, b, Profiles of H3K4me1 (a) and H3K27ac (b) associated with the lost (bottom
panels), conserved (top panels) and gained (middle panels) AR programs in
DHT-treated LNCaP cells in response to FOXA1 knockdown. c, d, Profiles of
H3K4me2 around AR binding loci at the nucleosomal resolution in response to
DHT stimulation in control siRNA-treated (c) or FOXA1 siRNA-treated
(d) LNCaP cells. e, Profiles of the histone variant H2A.Z on the three different
AR programs. f, Model for FoxA1-mediated AR targeting and reprogramming
in LNCaP cells. In class I (the lost AR program), FoxA1 licenses liganded AR to
bind to ARE in relatively nucleosome-free regions. AR binding does not induce
nucleosome remodelling in this class of enhancers. In class II (the conserved AR
program), AR binds independently of FoxA1 to ARE, inducing nucleosome
remodelling. In class III (the gained AR program), FoxA1 restricts AR binding,
despite the presence of strong AREs. Although pre-established, these gained
loci exhibit a strong central nucleosome and are associated with H2A.Z, which
is not affected by AR binding. FOXA1 knockdown converted these sites to
androgen-responsive sites. In all three classes, RNAs were generated or
increased after AR binding. e:p, enhancer:promoter.

is largely independent of FoxA1 (Fig. 4c, d, top panels). The gained
program showed a strong H3K4me2-marked central nucleosome both 
before and after AR binding (Fig. 4c, d, middle panel). Thus, this 
gained AR program represents a new type of enhancer topography 
that requires no nucleosome remodelling for enhancer recognition 
and subsequent enhancer-promoter interactions. H2A.Z is preva- 
lently associated with the gained AR program, modestly with the con- 
served AR program and absent in the lost AR program (Fig. 4e). 
Together, these findings establish distinct chromatin structures under- 
lying functionally distinct classes of AR enhancer. 

In summary, our findings imply a general principle for establishing
cell-type-specific transcription programs. Cell-lineage-specific factors
(such as FoxA1) coupled with other general transcriptional factors
‘create’ a cell-type-specific enhancer network, allowing other regulated
factors (such as AR) to ‘activate’ these pre-established enhancers
(Fig. 4f). The enhancer activation process is tightly linked to eRNA
production, which appears to serve as a more robust indicator of
enhancer activity than any enhancer-bound transcription activators
or chromatin marks. In the current biological model, AR reprogram-
ming dramatically altered the androgen-responsive pathway, which,
according to GO analysis (Supplementary Figs 22 and 23), may
contribute to enhanced cell growth and the establishment of an
appropriate microenvironment in advanced prostate cancer.
Together, these findings provide a conceptual framework to under-
stand complex gene-expression switching events, such as occur during
disease progression and development.


METHODS SUMMARY 


Experiments were performed on LNCaP cells, LNCaP-AR cells (gift of C. Sawyers)
and HEK293 cells. ChIPs were done as previously described and GRO was per-
formed as described. Control siRNA was purchased from Qiagen (1027280).
FOXA1 siRNA 1 (M-010319) and 2 (sense 5'-GAGAGAAAAAAUCAACAGC-3';
antisense 5'-GCUGUUGAUUUUUUCUCUC-3') were purchased from or synthe-
sized by Dharmacon.


Full Methods and any associated references are available in the online version of
the paper at www.nature.com/nature.

METHODS

Antibodies. Specific antibodies were purchased from the following commercial 
sources: anti-FoxA1 (ab5089), anti-H3K4me1 (ab8899), anti-H3K27ac (ab4729),
anti-H3K36me3 (ab9050) and H2A.Z (ab4174) from Abcam; anti-AR (N-20),
anti-FoxA1 (C-20) and p300 (C-20, gift of B. Ren) from Santa Cruz
Biotechnology; anti-H3K4me2 (07-030), anti-H3K4me3 (07-473), anti-H4K5ac 
(07-327) and anti-H3K27me3 (07-449) from Upstate Biotechnology; anti-Med12 
(A300-774A) from Bethyl Laboratories; and anti-beta Actin (AC74) from Sigma. 
siRNA transfection. One day before transfection, LNCaP cells were seeded in 
RPMI 1640 medium with 10% FBS. Six hours after siRNA transfection (20 pmol
ml⁻¹) with Lipofectamine 2000 (Invitrogen), cells were washed twice with PBS
and then maintained in hormone-deprived phenol-free RPMI 1640 media. For 
gene expression profiling and western blotting, cells were cultured for 3 days after 
transfection and then treated with DHT for 20h; for ChIP-qPCR and ChIP-seq, 
cells were cultured for 4 days after transfection and then treated with DHT for 1h. 
ChIP-seq analysis at the nucleosome resolution was based on cells treated with 
DHT for 4h. 

3C assay. Cells were crosslinked with 1% formaldehyde for 20 min at room tem-
perature and processed according to the standard 3C protocol. For the study on
the FASN locus, fixed chromatin from 5 X 10° cells was digested with 400 units of 
BglII and EcoR I (NEB). For the NDRG1 locus, fixed chromatin from 5 X 10° cells 
was digested with 400 units of HindIII (NEB). Ligation was done with 800 units of 
T4 DNA ligase (NEB) for 4h. The 3C product was quantified by qPCR after 
diluting the template tenfold compared with purified genomic DNA of known 
concentration. For each semi-quantitative PCR, the amount of template was 
titrated to determine the linear range in which the PCR product was amplified. 
PCR primers were designed next to BglII and HindIII restriction sites, respectively,
for the FASN (all in minus strand) promoter (5'-AAGCTGTGAGTCAGCATGGTAG-3')
and three upstream sites (−38 kb, 5'-TGTCTTCTGATGTGTCTGCTTAGAG-3';
−45 kb, 5'-AATCCTGCTCAGGAATCTGTATGT-3'; −54 kb,
5'-GGACACTACTGCTTTTTCCTGTG-3') and for the NDRG1 (all in plus
strand) promoter (5'-ATAGGTTCTGCCTTATTAGGG-3') and three upstream
sites (−42 kb, 5'-ATAGAGTTAGAGAAACGGAGGCAGT-3'; −56 kb,
5'-GCCGTGAAGAATAAACAAGATGAG-3'; −62 kb,
5'-ACACATTTTGTTCCCAGTGCAG-3').

Co-IP and western blotting analysis. HEK293 cells were seeded for 1 day, trans-
fected with the expression plasmids expressing wild-type, mutant AR and FoxA1
using Lipofectamine 2000 (Invitrogen) and then changed to hormone-depleted,
phenol-free DMEM medium. One day after plasmid transfection, cells were
treated with 100 nM DHT for another day. Cells were washed twice with cold PBS
and treated with 1 ml of lysis buffer (50 mM Tris pH 8.0, 150 mM NaCl, 1% NP-
40) supplemented with a cocktail of proteinase inhibitors (Sigma) for 5 min at
4 °C. Lysed cells were collected, rotated for 1 h at 4 °C and cell debris removed by
centrifugation at 18,000g for 30 min in a cold room. The supernatant was incu-
bated with anti-AR, anti-FoxA1 or immunoglobulin-G overnight at 4 °C followed
by the addition of 50 µl of 50% protein G beads to each tube. After rotating for
another 2 h at 4 °C, the beads were washed five times with the lysis buffer, twice
with cold PBS, and boiled for 6 min in 40 µl of 2× SDS loading buffer. Western
blotting analysis was performed with anti-AR or anti-FoxA1.

Luciferase reporter assay. PC3-AR and HEK293 cells were seeded into 24-well 
plates in hormone-depleted and phenol-free RPMI 1640 medium and DMEM
1 day before transfection. Transfection was performed according to the manufac-
turers’ recommendations (DOTAP Liposomal Transfection Reagent from Roche
or Lipofectamine 2000 from Invitrogen). One day after transfection, these cells were
treated with DHT for an additional day. After washing with cold PBS twice, cells 
were treated with the lysis buffer (Promega) and the Luciferase signal was 
recorded. 

Cell proliferation assay. The assay was based on the published protocol. Briefly,
LNCaP cells were transfected with control siRNA and FoxA1 siRNA (sequences
listed in Methods Summary) and cultured in hormone-depleted medium for
3 days. The cells were treated with different amounts of DHT for another day.
After the treatment, cells were washed with PBS, fixed with 70% EtOH and stored
at −20 °C for at least 2 h. Before analysis, cells were washed with cold PBS, re-
suspended in the propidium-iodide/Triton X-100 staining solution and incubated
at 37 °C for 15 min. After removing cell clumps, stained cells were sorted on a
Beckman FACScan, and the percentage of S-phase cells was calculated.

ChIP and ChIP-seq analyses. ChIP was as previously described. Briefly,
approximately 10⁷ treated cells were crosslinked with 1% formaldehyde at room
temperature for 15 min. After sonication, the soluble chromatin was incubated
with 1-5 µg of antibody. Specific immunocomplexes were precipitated with
Protein A/G beads (Sigma-Aldrich). Complexes were washed, DNA extracted
and purified by QIAquick Spin columns (Qiagen). Extracted DNA (1 µl from
60 µl) was used for qPCR with the specific PCR primers listed in Supplementary
Fig. 24, each of which was designed surrounding a specific region of 150-250 base
pairs (bp) on target DNA. PCR products were detected with SYBR Green on a
MX3000P System (Stratagene) and the percentage of immunoprecipitated chro-
matin was calculated from ΔCt relative to immunoglobulin-G control after nor-
malizing against input chromatin. For ChIP-seq, extracted DNA was ligated to
specific adaptors followed by deep sequencing in the Illumina GAII system
according to the manufacturer’s instructions. The first 25 bp for each sequence
tag returned by the Illumina Pipeline was aligned to the hg18 assembly (National
Center for Biotechnology Information, build 36.1) using Bowtie, allowing up to
two mismatches. Only tags uniquely mapped to the genome were used for further
analysis. The data were visualized by preparing custom tracks for the University of
California, Santa Cruz, genome browser using HOMER
(http://biowhat.ucsd.edu/homer). The total number of mappable reads was
normalized to 10⁷ for each experiment presented in this study. ChIP-seq at
nucleosome resolution was performed as previously reported. A summary of ChIP
experiments is provided in Supplementary Fig. 25.

Identification of ChIP-seq peaks. The identification of ChIP-seq peaks (bound 
regions) was performed using HOMER (http://biowhat.ucsd.edu/homer). For 
transcription factors, peaks were identified by searching locations of high read 
density using a 200-bp sliding window. Regions of maximal density exceeding a 
given threshold were called as peaks, and we required adjacent peaks to be at least 
500 bp away to avoid redundant detection. Only one tag from each unique position 
was considered to avoid clonal artefacts from the sequencing. The threshold for the 
number of tags that determined a valid peak was selected at a false discovery rate of 
0.001 determined by peak finding using randomized tag positions in a genome 
with an effective size of 2 × 10⁹ bp. We also required peaks to have at least fourfold
more tags (normalized to total count) than input control samples. In addition, we 
required fourfold more tags relative to the local background region (10 kb) to avoid 
identifying regions with genomic duplications or non-localized binding. 
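
The peak-calling criteria above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not HOMER's implementation: the function names, the choice to centre candidate windows on tag positions, the pseudocount of one tag used to avoid division by zero, and the `min_tags` argument (which stands in for the threshold derived from the false discovery rate) are all hypothetical.

```python
from bisect import bisect_left


def count_in(sorted_pos, start, end):
    """Number of positions p with start <= p < end in a sorted list."""
    return bisect_left(sorted_pos, end) - bisect_left(sorted_pos, start)


def call_peaks(chip_tags, input_tags, min_tags, window=200, min_spacing=500,
               fold_input=4.0, fold_local=4.0, local_size=10_000):
    # Keep one tag per unique position to limit clonal artefacts.
    chip = sorted(set(chip_tags))
    ctrl = sorted(set(input_tags))
    # Rescale input counts to the ChIP tag total (assumed normalization).
    scale = len(chip) / max(len(ctrl), 1)

    candidates = []
    for centre in chip:  # assumption: candidate windows are centred on tags
        lo, hi = centre - window // 2, centre + window // 2
        n = count_in(chip, lo, hi)
        if n < min_tags:
            continue
        # At least fourfold more tags than the normalized input control.
        n_input = count_in(ctrl, lo, hi) * scale
        if n < fold_input * max(n_input, 1.0):
            continue
        # At least fourfold more tags than expected from the local 10-kb region.
        n_local = count_in(chip, centre - local_size // 2,
                           centre + local_size // 2) - n
        expected = n_local * window / (local_size - window)
        if n < fold_local * max(expected, 1.0):
            continue
        candidates.append((n, centre))

    # Accept the densest windows first and enforce the minimum peak spacing.
    peaks = []
    for n, centre in sorted(candidates, reverse=True):
        if all(abs(centre - p) >= min_spacing for p, _ in peaks):
            peaks.append((centre, n))
    return sorted(peaks)
```

For example, a cluster of 20 distinct tag positions within 20 bp yields a single peak, whereas 50 duplicate tags at one position count only once and are not called.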

The peak finding procedure was modified to identify regions harbouring spe- 
cific histone modifications, as these experiments tend to yield broad areas of 
enrichment over several hundreds or thousands of base pairs. Seed regions were 
initially found using a peak size of 500 bp at the false discovery rate of 0.001 to 
identify enriched loci. Enriched loci found within 1 kb of one another were then 
merged to yield variable-length regions. Transcription factor peaks and histone 
modification regions were associated with gene products by identifying the nearest 
RefSeq TSS. Annotated positions for promoters, exons, introns and other features 
were based on RefSeq transcripts and repeat annotations from University of 
California, Santa Cruz. Peaks from separate experiments were considered equival- 
ent/co-bound if their peak centres were located within 200 bp of each other. Read 
density heat maps were created by first using HOMER to generate read densities 
and then visualized using Java TreeView (http://jtreeview.sourceforge.net). 
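
The merging of enriched loci and the co-binding rule can be illustrated as follows. This is a sketch only: the function names are hypothetical, and "within 1 kb" is assumed to mean that the gap between the end of one locus and the start of the next is at most 1,000 bp.

```python
from bisect import bisect_left


def merge_loci(loci, max_gap=1000):
    """Merge (start, end) loci separated by at most max_gap bp."""
    merged = []
    for start, end in sorted(loci):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(region) for region in merged]


def co_bound(centres_a, centres_b, max_dist=200):
    """Peak centres in A that lie within max_dist bp of a centre in B."""
    b = sorted(centres_b)
    shared = []
    for a in centres_a:
        i = bisect_left(b, a - max_dist)  # first B centre >= a - max_dist
        if i < len(b) and b[i] <= a + max_dist:
            shared.append(a)
    return shared
```

Here `merge_loci([(0, 500), (1200, 1700), (5000, 5500)])` joins the first two seed regions into one variable-length region and leaves the third unchanged.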

HOMER for de novo motif discovery and known motif enrichment. Motif
discovery was performed using a comparative algorithm similar to those previously
described. An in-depth description will be published elsewhere (Benner et al., in
preparation). Motif finding for transcription factors was performed on sequence
from ±100 bp relative to the peak centre, whereas motif finding for histone modi-
fication regions was performed on sequence from ±500 bp relative to the region
centre. Briefly, sequences were divided into target and background sets for each
application of the algorithm. Background sequences were then selectively weighted 
to equalize the distributions of G + C content in target and background sequences 
to avoid comparing sequences of different general sequence content. Motifs of 
length 8, 10, 12, 14, 16 and 18 bp were identified separately by first exhaustively 
screening all oligonucleotides for enrichment in the target set compared with the 
background set using the cumulative hypergeometric distribution to score enrich- 
ment. Up to three mismatches were allowed in each oligonucleotide sequence to 
increase the sensitivity of the method. The top 200 oligonucleotides of each length 
with the lowest P values were then converted into probability matrices and heur- 
istically optimized to maximize hypergeometric enrichment of each motif in the 
given data set. As optimized motifs were found they were removed from the data set 
to facilitate the identification of additional motifs in subsequent rounds. HOMER 
also screens the enrichment of known motifs previously identified through the 
analysis of published ChIP-ChIP and ChIP-Seq data sets by calculating the known 
motifs’ hypergeometric enrichment in the same set of G + C normalized sequences 
used for de novo analysis. Sequence logos were generated using WebLOGO (http:// 
weblogo.berkeley.edu). Motif enrichment heatmaps and dendrograms were created 
by clustering hypergeometric log P values using Cluster (http://bonsai.ims. 
u-tokyo.ac.jp/~mdehoon/software/cluster/software.htm#ctv) and Java TreeView 
(http://jtreeview.sourceforge.net). 
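
The enrichment score at the core of this procedure is the cumulative hypergeometric probability. A minimal sketch follows, using only exact oligonucleotide matches; the mismatch tolerance, G + C weighting and matrix optimization described above are deliberately omitted, and the function names are hypothetical rather than part of HOMER.

```python
from math import comb, log


def hypergeom_sf(k, N, K, n):
    """P(X >= k): n target sequences drawn from N, of which K carry the motif."""
    total = comb(N, n)
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1))
    return hits / total


def motif_log_p(oligo, targets, background):
    """Natural-log P value for enrichment of an oligonucleotide in the target set."""
    k = sum(oligo in seq for seq in targets)           # target sequences with motif
    K = k + sum(oligo in seq for seq in background)    # all sequences with motif
    N = len(targets) + len(background)
    return log(hypergeom_sf(k, N, K, len(targets)))
```

With five target sequences that all contain the oligonucleotide and five background sequences that do not, the P value is 1/252, the probability of drawing all five motif-containing sequences by chance.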

GRO-seq. Global run-on and library preparation for sequencing were done as
described. Briefly, four 10-cm plates of confluent LNCaP cells per treatment were
washed three times with cold PBS buffer. Cells were then swelled in swelling buffer
(10 mM Tris pH 7.5, 2 mM MgCl₂, 3 mM CaCl₂) for 5 min on ice. Harvested cells
were re-suspended in 1 ml of the lysis buffer (swelling buffer with 0.5% IGEPAL
and 10% glycerol) with gentle vortex and brought to 10 ml with the same buffer for
extraction of nuclei. Nuclei were washed with 10 ml of lysis buffer and re-
suspended in 1 ml of freezing buffer (50 mM Tris pH 8.3, 40% glycerol, 5 mM
MgCl₂, 0.1 mM EDTA), pelleted down again and finally re-suspended in 100 µl of
freezing buffer.

For run-on assay, re-suspended nuclei were mixed with an equal volume of
reaction buffer (10 mM Tris pH 8.0, 5 mM MgCl₂, 1 mM DTT, 300 mM KCl, 20
units of SUPERase-In, 1% Sarkosyl, 500 µM ATP, GTP and Br-UTP, 2 µM CTP)
and incubated for 5 min at 30 °C. Nuclear RNA was extracted with TRIzol LS
reagent (Invitrogen) according to the manufacturer’s instructions. RNA was then
re-suspended in 20 µl of DEPC-water and subjected to base hydrolysis by addition
of 5 µl of 1 M NaOH and incubated on ice for 40 min. Then, 25 µl of 1 M Tris pH
6.8 was added to neutralize the reaction. RNA was purified through a p-30 RNase-
free spin column (BioRad), according to the manufacturer’s instructions and
treated with 6.7 µl of DNase buffer and 10 µl of RQ1 RNase-free DNase
(Promega), and purified again through a p-30 column. A volume of 8.5 µl 10×
antarctic phosphatase buffer, 1 µl of SUPERase-In and 5 µl of antarctic phospha-
tase was added to the run-on RNA and treated for 1 h at 37 °C. Before proceeding
to immunopurification, RNA was heated to 65 °C for 5 min and kept on ice.

Anti-BrdU agarose beads (Santa Cruz Biotech) were blocked in blocking buffer
(0.5× SSPE, 1 mM EDTA, 0.05% Tween-20, 0.1% PVP, and 1 mg ml⁻¹ BSA) for
1 h at 4 °C. Heated run-on RNA (~85 µl) was added to 60 µl beads in 500 µl
binding buffer (0.5× SSPE, 1 mM EDTA, 0.05% Tween-20) and allowed to bind
for 1 h at 4 °C with rotation. After binding, beads were washed once in low salt
buffer (0.2× SSPE, 1 mM EDTA, 0.05% Tween-20), twice in high salt buffer (0.5×
SSPE, 1 mM EDTA, 0.05% Tween-20, 150 mM NaCl) and twice in TET buffer (TE
pH 7.4, 0.05% Tween-20). BrU-incorporated RNA was eluted with 4 × 125 µl
elution buffer (20 mM DTT, 300 mM NaCl, 5 mM Tris pH 7.5, 1 mM EDTA
and 0.1% SDS). RNA was then extracted with acidic phenol/chloroform once,
chloroform once and precipitated with ethanol overnight. The precipitated
RNA was re-suspended in 50 µl reaction (45 µl of DEPC water, 5.2 µl of T4
PNK buffer, 1 µl of SUPERase-In and 1 µl of T4 PNK (NEB)) and incubated at
37 °C for 1 h. The RNA was extracted and precipitated again as above.

Complementary DNA (cDNA) synthesis was performed basically as described
with minor modifications. First, RNA fragments were subjected to poly-A tailing
reaction in 8.0 µl volume containing 0.8 µl poly-A polymerase buffer, 1 µl 1 mM
ATP, 0.5 µl SUPERase-In and 0.75 µl poly-A polymerase (NEB). The reaction was
performed for 30 min at 37 °C. Subsequently, reverse transcription was performed
using oNTI223 primer (5'-pGATCGTCGGACTGTAGAACTCT;CAAGCAGA
AGACGGCATACGATTTTTTTTTTTTTTTTTTTTTTVN-3') where the p indi-
cates 5' phosphorylation, ‘;’ indicates the abasic dSpacer furan and VN indicates
degenerate nucleotides.

Tailed RNA (8.0 µl) was mixed with 1 µl dNTP (10 mM each) and 2.5 µl
12.5 µM oNTI223, heated for 3 min at 75 °C and chilled briefly on ice. Then,
0.5 µl SUPERase-In, 3 µl 0.1 M DTT, 2 µl 25 mM MgCl₂, 2 µl 10× reverse tran-
scription buffer and 1 µl SuperScript III reverse transcriptase (Invitrogen) were
added to the tube. The tube was incubated for 30 min at 48 °C. After that, 4 µl
of Exonuclease I (Fermentas) was added into the reaction and the tube was incu-
bated for 1 h at 37 °C to eliminate extra oNTI223. Then RNA was eliminated by
adding 1.8 µl 1 M NaOH and incubating for 20 min at 98 °C. The reaction was
neutralized with 1.8 µl of 1 M HCl. After running on a 10% polyacrylamide TBE-
urea gel, the extended first-strand cDNA product was excised and recovered by
soaking the ground gel in DNA gel elution buffer (TE with 0.1% Tween-20 and
150 mM NaCl) overnight and then precipitated with ethanol.

Circularization of first-strand cDNA was performed by re-suspending cDNA in
9.5 µl reaction solution (7.5 µl water, 1 µl CircLigase buffer, 0.5 µl 1 mM ATP and
0.5 µl 50 mM MnCl₂) and then adding 0.5 µl CircLigase (Epicentre). The reaction
went for 1 h at 60 °C and then was heat-inactivated for 20 min at 80 °C.
Circularized single-stranded DNA (ssDNA) was relinearized by adding 3.8 µl of
4× relinearization supplement (100 mM KCl, 2 mM DTT) followed by 1.5 µl of
ApeI (15 U, NEB). The reaction was incubated for 1 h at 37 °C. Relinearized ssDNA
was separated in a 10% polyacrylamide TBE-urea gel (Invitrogen) as described
above. The relinearized product band was excised (~120-300 bp) and the DNA
was recovered as described above.

The ssDNA template was amplified by PCR using the Phusion High-Fidelity 
enzyme (NEB) according to the manufacturer’s instructions. The oligonucleotide 
primers oNTI200 (5'-CAAGCAGAAGACGGCATA-3’) and oNTI201 (5'-AATG 
ATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG-3') were 
used to generate DNA for sequencing. PCR was performed with an initial 30-s 
denaturation at 98 °C, followed by 13 cycles of 10-s denaturation at 98°C, 15-s 
annealing at 60 °C and 15-s extension at 72°C. The PCR product was run on a 
non-denaturing 8% polyacrylamide TBE gel and recovered as mentioned before.
DNA was then sequenced on the Illumina Genome Analyser II according to the
manufacturer’s instructions with small RNA sequencing primer
5'-CGACAGGTTCAGAGTTCTACAGTCCGACGATC-3'.

De novo identification of GRO-Seq transcripts. To identify transcription units 
in an unbiased manner, GRO-Seq read densities were analysed to classify genomic 
regions into contiguous transcripts using HOMER. GRO-Seq read densities were 
initially normalized using the GC content of individual reads to remove any 
systematic bias introduced by overall GC content variation between read libraries. 
To maximize read depth for transcript identification, all GRO-Seq libraries were 
merged to perform the initial transcript discovery, and later considered separately 
to identify regulated transcripts. For each strand of each chromosome, GRO-Seq 
read densities were calculated using a sliding window of 250 bp. Regions for which 
GRO-Seq reads could not be uniquely mapped (that is, repeats) were first iden- 
tified and then read densities in these regions were estimated using upstream read 
densities from mappable regions to avoid ending predicted transcripts prema- 
turely. Transcript initiation sites were identified as regions where the GRO-Seq 
read density increased threefold relative to the previous 1 kb region. Transcript 
termination sites were defined by either a reduction in reads below 10% of the start 
of the transcript or when another transcript’s start site occurred on the same 
strand. Single spikes in read density covering a span less than 250 bp were con- 
sidered artefacts and discarded. Identified transcripts were strand-specifically 
compared with RefSeq transcripts by looking for overlap in the transcribed region. 
Transcripts were defined as putative eRNAs if their TSS was located distal to
RefSeq TSS (>3 kb) and were associated with H3K4me1 regions. To identify
differentially regulated transcripts, strand-specific read counts from each GRO-
Seq experiment were determined for each transcript using HOMER. EdgeR
(http://www.bioconductor.org/packages/release/bioc/html/edgeR.html) was then
used to calculate differential expression (>1.5-fold, <0.01 false discovery rate).
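
Two of the rules above, the threefold rise that marks a transcript initiation site and the definition of a putative eRNA, can be sketched as follows. This is an illustration rather than the HOMER code: the function names, the binned density representation, the pseudocount of one read and the reading of "associated with" as "TSS lies inside an H3K4me1 region" are assumptions, and strand is ignored for brevity.

```python
from bisect import bisect_left


def initiation_bins(density, bin_size=250, lookback=1000, fold=3.0, pseudo=1.0):
    """Indices of bins where read density first rises >= fold over the prior 1 kb."""
    n_prev = lookback // bin_size
    starts, rising = [], False
    for i in range(n_prev, len(density)):
        prev = sum(density[i - n_prev:i]) / n_prev
        hit = density[i] >= fold * max(prev, pseudo)
        if hit and not rising:  # report only the first bin of each rise
            starts.append(i)
        rising = hit
    return starts


def is_putative_erna(tss, refseq_tss_sorted, h3k4me1_regions, min_dist=3000):
    """True if the TSS is >3 kb from any RefSeq TSS and inside an H3K4me1 region."""
    i = bisect_left(refseq_tss_sorted, tss)
    neighbours = refseq_tss_sorted[max(i - 1, 0):i + 1]  # nearest on each side
    distal = all(abs(tss - r) > min_dist for r in neighbours)
    marked = any(start <= tss <= end for start, end in h3k4me1_regions)
    return distal and marked
```

A transcript starting 9 kb from the nearest RefSeq TSS and inside an H3K4me1 region is classified as a putative eRNA; one starting 1 kb from a RefSeq TSS is not, whatever its chromatin marks.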
Microarray and reverse-transcription qPCR analyses of gene expression. Total 
RNA was isolated with the RNeasy Mini Kit (Qiagen) and treated by RNase-free 
DNase I. For PCR with reverse transcription, first-strand cDNA synthesis from total 
RNA was performed with the Superscript III cDNA Synthesis System (Invitrogen). 
Microarray analysis was performed on Human V2 Chips (Illumina). The published 
gene expression profiling data GDS2545 (refs 36, 37) and GDS1439 (ref. 38) were 
extracted from the National Center for Biotechnology Information, normalized and 
P values calculated by two-tailed t-test. For validation by PCR with reverse tran-
scription, cDNA was analysed with SYBR Green (Stratagene) on the Mx3000P
System (Stratagene). The relative messenger RNA level was calculated by comparing 
with non-treatment control, after normalization with GAPDH or ACTB messenger 
RNA. The primers for RT-qPCR (5’-3') were as follows: ACTB-5, CGTCCCAGT 
TGGTGACGATG; ACTB-3, GCCGTCTTCCCCTCCATC; GAPDH-5, GTTTT 
TCTAGACGGCAGGTCAGG; GAPDH-3, AACATCATCCCTGCCTCTACTGG; 
KLK3-5, TGTGTGCGCAAGTTCACC; KLK3-3, GGTTCACTGCCCCATGAC; 
RASSF3-5, GACGCCGAGGACTTCTTCTT; RASSF3-3, TGCTGAGGTAACT 
GTGGGTTT; SOX9-5, GACTCGCCACACTCCTCCTC; SOX9-3, AAGTCGAT 
AGGGGGCTGTCT; IL6R-5, GAGATTCTGCAAATGCGACA; IL6R-3, GTTGGG
GAGATGAGAGGAACA; DNM2-5, TGTTTGCCAACAGTGACCTC; DNM2-3, 
CCCAGACCACTGAAGCTCCT. 

Survival analysis. Two independent sets of gene expression data were used to
check the association between FoxA1 and clinical outcome of patients by Kaplan-
Meier analysis. One data set came from 78 patients with prostate tumours (age
<70), the other from 131 patients. Significant association with outcome was
determined by log-rank test for survival. Hazard ratios were calculated by the Cox
proportional hazards model. All statistics were analysed with the statistical software R
(version 2.6.2), available from the R Project for Statistical Computing website
(http://www.r-project.org). The cut-off was determined so that the log-rank P
value was the smallest among candidate cut-offs spanning the 5th-95th percen-
tiles of signals.
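
The cut-off selection can be illustrated with a small, self-contained sketch. The analysis itself was run in R; the Python below is a stand-in written for illustration, with hypothetical function names, a simple rank-based approximation of the 5th and 95th percentiles, and the standard two-group log-rank statistic referred to a chi-squared distribution with one degree of freedom.

```python
from math import erfc, sqrt


def logrank_p(times, events, in_group):
    """Two-group log-rank P value; events[i] is 1 for an event, 0 if censored."""
    o_minus_e, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)                          # at risk
        n1 = sum(1 for x, g in zip(times, in_group) if x >= t and g)
        d = sum(1 for x, e in zip(times, events) if x == t and e)    # events at t
        d1 = sum(1 for x, e, g in zip(times, events, in_group)
                 if x == t and e and g)
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    # Survival function of chi-squared (1 d.f.) at x is erfc(sqrt(x / 2)).
    return erfc(sqrt(o_minus_e ** 2 / var / 2))


def best_cutoff(expression, times, events, lo_pct=5, hi_pct=95):
    """Cut-off between the given percentiles that minimizes the log-rank P value."""
    ranked = sorted(expression)
    n = len(ranked)
    lo = ranked[int(n * lo_pct / 100)]
    hi = ranked[min(int(n * hi_pct / 100), n - 1)]
    best = None
    for c in sorted(set(ranked)):
        if not lo <= c <= hi:
            continue
        p = logrank_p(times, events, [x > c for x in expression])
        if best is None or p < best[1]:
            best = (c, p)
    return best
```

`best_cutoff` returns the chosen signal threshold together with its P value; patients with expression above the threshold form the high-expression group.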


31. Hu, Q. et al. Enhancing nuclear receptor-induced transcription requires nuclear
motor and LSD1-dependent gene networking in interchromatin granules. Proc.
Natl Acad. Sci. USA 105, 19199-19204 (2008).

32. Holbro, T. et al. The ErbB2/ErbB3 heterodimer functions as an oncogenic unit:
ErbB2 requires ErbB3 to drive breast tumor cell proliferation. Proc. Natl Acad. Sci.
USA 100, 8933-8938 (2003).

33. Heinz, S. et al. Simple combinations of lineage-determining transcription factors
prime cis-regulatory elements required for macrophage and B cell identities. Mol.
Cell 38, 576-589 (2010).

34. Barski, A. et al. High-resolution profiling of histone methylations in the human
genome. Cell 129, 823-837 (2007).

35. Linhart, C., Halperin, Y. & Shamir, R. Transcription factor and microRNA motif 
discovery: the Amadeus platform and a compendium of metazoan target sets. 
Genome Res. 18, 1180-1189 (2008). 

36. Chandran, U. R. et al. Gene expression profiles of prostate cancer reveal 
involvement of multiple molecular pathways in the metastatic process. BMC 
Cancer 7, 64 (2007). 


©2011 Macmillan Publishers Limited. All rights reserved 




37. Yu, Y. P. et al. Gene expression alterations in prostate cancer predicting tumor
aggression and preceding development of malignancy. J. Clin. Oncol. 22,
2790-2799 (2004).

38. Varambally, S. et al. Integrative genomic and proteomic analysis of prostate cancer
reveals signatures of metastatic progression. Cancer Cell 8, 393-406 (2005).

39. Glinsky, G. V., Glinskii, A. B., Stephenson, A. J., Hoffman, R. M. & Gerald, W. L. Gene
expression profiling predicts clinical outcome of prostate cancer. J. Clin. Invest.
113, 913-923 (2004).

40. Taylor, B. S. et al. Integrative genomic profiling of human prostate cancer. Cancer
Cell 18, 11-22 (2010).




LETTER


doi:10.1038/nature10013 


Interannual atmospheric variability forced by the 
deep equatorial Atlantic Ocean 


Peter Brandt¹, Andreas Funk¹, Verena Hormann¹†, Marcus Dengler¹, Richard J. Greatbatch¹ & John M. Toole²


Climate variability in the tropical Atlantic Ocean is determined by
large-scale ocean-atmosphere interactions, which particularly affect
deep atmospheric convection over the ocean and surrounding con-
tinents. Apart from influences from the Pacific El Niño/Southern
Oscillation and the North Atlantic Oscillation, the tropical Atlantic
variability is thought to be dominated by two distinct ocean-
atmosphere coupled modes of variability that are characterized by
meridional and zonal sea-surface-temperature gradients and are
mainly active on decadal and interannual timescales, respectively.
Here we report evidence that the intrinsic ocean dynamics of the deep
equatorial Atlantic can also affect sea surface temperature, wind and
rainfall in the tropical Atlantic region and constitute a 4.5-yr climate
cycle. Specifically, vertically alternating deep zonal jets of short ver-
tical wavelength with a period of about 4.5 yr and amplitudes of more
than 10 cm s⁻¹ are observed, in the deep Atlantic, to propagate their
energy upwards, towards the surface. They are linked, at the sea
surface, to equatorial zonal current anomalies and eastern Atlantic
temperature anomalies that have amplitudes of about 6 cm s⁻¹ and
0.4 °C, respectively, and are associated with distinct wind and rainfall
patterns. Although deep jets are also observed in the Pacific and
Indian oceans, only the Atlantic deep jets seem to oscillate on inter-
annual timescales. Our knowledge of the persistence and regularity 
of these jets is limited by the availability of high-quality data. Despite 
this caveat, the oscillatory behaviour can still be used to improve 
predictions of sea surface temperature in the tropical Atlantic. 
Deep-jet generation and upward energy transmission through the 
Equatorial Undercurrent warrant further theoretical study. 

Tropical Atlantic variability, which modulates the seasonal migra- 
tion of the intertropical convergence zone, is dominated by two modes 
of behaviour*”. The meridional mode, peaking during boreal spring, is 
characterized by a north-south sea-surface-temperature (SST) gra- 
dient that drives cross-equatorial wind anomalies from the cold hemi- 
sphere to the warm*”. The zonal mode is characterized by an east-west 
SST gradient along the Equator and is associated with marked zonal 
wind anomalies®”. It is most pronounced during boreal summer when 
the seasonal maximum in equatorial upwelling leads to the develop- 
ment of the eastern Atlantic SST cold tongue. The zonal mode is often 
referred to as the Atlantic counterpart to the Pacific El Niño. The
period of zonal-mode-like oscillations estimated from observations, 
models and theory ranges from 19 months to 4 years*'*"°. However, 
aspects of the intrinsic ocean dynamics, such as year-to-year variations 
in the strength of tropical instability waves, are similarly identified as 
causes of interannual SST variability’ and may themselves be able to 
force variability in the atmosphere. 

During the past 10-20 yr, the eastern equatorial Atlantic SST, repre- 
sented by the ATL3 index (that is, the average SST anomaly inside the box 
shown in Fig. 1a), has shown pronounced variability on interannual
timescales, dominated by the period range of 4-5 yr; maximum explained 
variance of different ocean parameters is found at a period of 1,670 d
(Supplementary Fig. 1). The associated harmonic amplitude of local
SST fluctuations, which is 0.29 ± 0.08 °C averaged over the ATL3
region, is generally high in the eastern equatorial Atlantic, with local 
amplitudes of up to 0.4°C (Fig. 1a and Supplementary Fig. 2). The 
regression of surface winds and rainfall on the 1,670-d SST harmonic 
reveals that anomalous westerlies along the Equator, convergent 
meridional wind anomalies particularly in the western tropical 
Atlantic, and positive rainfall anomalies in a wide belt around the 
Equator are associated with positive SST anomalies. 

A 1,670-d cycle is also found in the surface geostrophic zonal velocity 
anomaly at the Equator and is again the dominant interannual variability, 
with a harmonic amplitude of 5.9 ± 1.9 cm s⁻¹. Phases of eastward surface
flow coincide with SST warm phases in the eastern equatorial
Atlantic (Fig. 1b). Whereas the 1,670-d period stands out as the dominant 
interannual variability timescale of the equatorial zonal surface flow, this 
is not the case for the wind forcing, which instead shows more irregular 
fluctuations during the analysed time interval (NCEP/NCAR reanalysis 
wind data). Such a dominant signal in the ocean seems to contradict early 
model results, in which the equatorial ocean response to wind forcing 
with periods longer than about 150 d was found to be a succession of
equilibrium responses with the strength of the flow independent of the 
forcing period’*. As we show below, variability in the 4-5-yr period band 
is a ubiquitous feature of the equatorial Atlantic and, furthermore, is 
associated with upward propagation of energy in the ocean. We propose 
that the variability in the equatorial zonal surface flow is not due to wind 
forcing with the same period but rather is a mode internal to the ocean, 
with its origin in the abyss (perhaps as deep as several thousand metres). 
If this is indeed the case, then the observed atmospheric variability in the 
4-5-yr period band in the equatorial Atlantic can be interpreted as a 
consequence of internal ocean dynamics. 

Analysis of zonal velocities at 1,000-m depth as observed by Argo 
floats’” reveals periodic behaviour similar to that of the SST and surface 
geostrophic zonal velocity anomalies (Fig. 1b). The dominant period, of 
4.4 yr, in the Argo float drift data for the period 1998-2010 is in agreement
with earlier estimates from moored zonal velocity observations in the 
depth range 600-1,800 m made during 2000-2006" (4.4 yr) and with 
the estimate from hydrographic observations made during 1972-1998" 
(5 ± 1 yr). The deep velocity and density fluctuations have been dynamically
described as a mixture of high-baroclinic-mode Kelvin and Rossby
waves representing quasi-steady equatorial deep jets'®"’. Such vertically 
alternating zonal jets with vertical wavelengths between 300 and 700 m are
similarly present in the Pacific’”° and Indian oceans’**". In the Atlantic, a 
downward phase velocity of equatorial deep jets (of about 100 m yr⁻¹) is
observed" that corresponds, according to linear internal wave theory, to 
upward energy propagation. Our moored observations reveal downward 
phase propagation from below the Equatorial Undercurrent (EUC) at 
about 200-m depth to about 2,000-m depth (Fig. 2 and Supplementary 
Fig. 3), suggesting a deep generation mechanism for equatorial deep jets. 
Observed variations in the vertical phase velocity are probably due to 
changes in the amplitudes of different superimposed baroclinic modes, 
as also indicated by changes in the vertical wavelength (Fig. 2). Theories of 
Figure 1 | Interannual variability in the tropical
Atlantic associated with a 1,670-d cycle.

a, Anomalies of SST (colour scale), surface wind 
(arrows) and rainfall (white contours: solid, 
positive; dashed, negative; every 0.15 mm d⁻¹) as
determined through regression on the harmonic fit 
of the SST anomalies (microwave optimally 
interpolated SST) averaged within the marked box 
(ATL3: 3° S-3° N, 20° W-0°). We mark significant
correlations (95%) of harmonic fit with SSTs (black 
lines), winds (black arrows) and rainfall (white 
dotted lines). b, ATL3 SST anomaly (microwave 
optimally interpolated SST, red dashed; HadISST, 
red thin solid) with 1,670-d harmonic fit (red thick
solid), surface zonal velocity anomaly (Equator,
35° W-15° W; black thin solid) with 1,670-d
harmonic fit (black thick solid), and 1,000-m zonal
velocity (1° S-1° N, 35° W-15° W; black dots with
standard errors) with 1,670-d harmonic fit (black
thick dashed). Axes: SST anomaly (°C) and zonal velocity (m s⁻¹).
deep-jet generation involve instabilities associated with the propagation of
intraseasonal mixed Rossby gravity waves”** or the Equator-crossing 
deep western boundary current”*. However, until now the proposed 
theories have failed to explain the observed strength and complex 
behaviour of the deep jets in the different ocean basins. 

Propagation of deep-jet energy towards the surface is complicated 
by the presence of a strong, vertically sheared mean current, the EUC,
with maximum eastward velocities of more than 60 cm s⁻¹ at about
80-m depth (Fig. 3a). Theoretical studies indicate that the EUC effec- 
tively modifies dispersion characteristics of Kelvin and Rossby 
waves”. On seasonal timescales, the background flow partly inhibits 
the downward propagation of high-baroclinic-mode energy, explaining
the dominance of low-baroclinic-mode seasonal waves at depth. 
Theoretical studies of internal wave propagation motivated by observed 
internal wave transmissions across an atmospheric jet suggest, however, 
that an energy transfer across critical levels—that is, where the horizontal 
phase velocity equals the background mean flow—is possible”. 

The amplitude of the 1,670-d harmonic oscillation of zonal velocity 
in the upper 600 m of the water column is largest in the 300-600-m 
depth interval (Fig. 3b), where it explains up to 60% of the variance 
contained in the monthly zonal velocity anomalies (Fig. 3d). Local 
minima in the amplitude of the 1,670-d oscillation are indicated near 
Figure 2 | Zonal velocities at the Equator, 23° W.
Velocity data above 600 m are from a moored
acoustic Doppler current profiler with annual and
semi-annual cycles subtracted, those between 600
and 1,000 m are from two single-point current
meters, and those below 1,000 m are from a
moored profiler. The white areas mark depths not
sampled by the deployed instrumentation.
Linearized phase lines (eastward jets, solid;
westward jets, dashed) of equatorial deep jets are
calculated from about 7 yr of moored current data
(above 600 m) and from the presented data (below 
1,000 m). Associated vertical phase velocities are 
given in the figure. 
Figure 3 | Mean zonal velocity, zonal temperature gradient and harmonic
analysis of 1,670-d oscillation. a, Moored mean zonal velocity (U) at the
Equator, 23° W (black), and climatological”” mean zonal temperature (T)
difference at the Equator between 0° and 30° W (red). b-d, 1,670-d harmonic
amplitude (b), phase (c) and explained variance (d) of equatorial moored zonal
velocities at 23° W (black curves, U(23° W)); equatorial surface zonal velocity
averaged between 35° W and 15° W (black dots), and subsurface temperatures
(red curves, T(10° W), and small red dots) and microwave optimally interpolated
SST (big red dot) at the Equator, 10° W. Zero phase corresponds to 1 January 
1993; explained variance is calculated using monthly mean data with the mean 
seasonal cycle subtracted. Information on the calculation of error bars in b and 
c can be found in Methods. 


the core and at the lower boundary of the EUC (Fig. 3b), and amplitudes 
of about 6 cm s⁻¹ are derived at the surface. The variance explained by
the 1,670-d harmonic oscillation decreases towards the surface 
(Fig. 3d), mainly as a result of the increasing strength of intraseasonal 
fluctuations. Although the vertical phase propagation is consistently 
downward below the EUC, the phase jumps by about 180° at the lower 
boundary of the EUC (Fig. 3c), approximately at the critical level for the 
propagation of high-baroclinic-mode equatorial Kelvin waves. 

The 1,670-d fluctuations are also pronounced in subsurface 
temperature records. Temperatures are affected in two ways by the 
presence of equatorial deep jets: isopycnal displacements associated
with the deep jets will lead to temperature variations that are phase- 
shifted in space and time relative to the velocity anomalies, depending 
on the character (Rossby or Kelvin) of the wave’®; and in the presence 
of climatological zonal temperature gradients, zonal advection asso- 
ciated with the jets might induce changes in the temperature fields. For 
example, in-phase oscillations of surface zonal velocity and near- 
surface temperatures (Fig. 3c) are in agreement with the propagation 
of equatorial Kelvin waves; that is, eastward velocities are associated 
with downward isopycnal (isothermal) displacements and vice versa. 
A deeper thermocline could, in turn, be associated with reduced down- 
ward heat transport through diapycnal mixing causing higher SSTs. In 
the equatorial Atlantic, the climatological’’ zonal temperature gradient 
changes sign with depth, further complicating the interpretation of the 
observed phase structure of the subsurface temperature variability: for 
example, the reversal of the zonal temperature gradient with depth in 
the lower part of the EUC (Fig. 3a) might be responsible for the phase 
shift with depth of the 1,670-d harmonic oscillation of the subsurface 
temperature (Fig. 3c). Although the understanding of the propagation 
characteristics of the jets in the presence of strong mean currents and 
zonal tracer gradients deserves further theoretical study, these obser- 
vations suggest that equatorial deep-jet energy propagates to the sur- 
face and affects sea surface conditions. 

Observations in the equatorial Atlantic reveal a similar periodic 
behaviour for deep-jet oscillations over different time intervals and 
depth ranges'®"’. Such consistent behaviour could arise from the 
development of high baroclinic basin modes” established by the 
eastward and westward propagation of Kelvin and Rossby waves, 
respectively~*. In this case, vertical phase and energy propagation can 
occur only for quasi-resonant modes with active forcing and dissipa- 
tion. The basin width of the Indian Ocean suggests a similar period for 
equatorial deep-jet oscillations as in the Atlantic, with rather different 
behaviour in the Pacific as a result of the much greater basin width. 
Argo float drift data from about 1,000-m depth represent a consistent 
data set that is available for all three oceans’’. In the Atlantic, maximum 
explained variance is found for westward-propagating Rossby waves of
baroclinic mode 13 (corresponding to a vertical wavelength of about
600 m at 1,000-m depth) and a period of 1,580 d, corresponding to a
zonal wavelength (λ = 2π/|k|, where k is the zonal wavenumber) of
11 × 10³ km. In the Pacific, only weak signals of high-baroclinic-mode
variability were extracted from the approximately 7-yr-long time 
series, which could be expected as estimated deep-jet oscillation periods 
are in the multidecadal range’*’’. The dominant signal there is asso- 
ciated with low-baroclinic-mode variability. Despite there being geo- 
metric similarities between the Indian and Atlantic oceans, during the 
analysed time frame the Indian Ocean Argo float velocities are char- 
acterized by incoherent signals in the interannual period range, with no 
preferred period (Fig. 4). From this analysis, we expect no influence of 
equatorial deep jets on the surface conditions in the Indian and Pacific 
oceans on interannual timescales. 

In analysing the seasonality of the Atlantic deep-jet surface expres- 
sions, we find that the amplitude of the 1,670-d cycle of zonal velocity 
is seasonally independent whereas the corresponding amplitudes of 
the ATL3 SST anomalies at this period are instead strongest during 
boreal summer and November/December (Supplementary Figs 6 and 
7). These periods are identified as cold seasons with shallow thermo- 
cline depths in the east and active Bjerknes positive feedback””®. 
During boreal spring when the tropical Atlantic is uniformly warm, 
the influence of the 1,670-d zonal velocity anomalies on SST is weak. 
Such behaviour is consistent with the equatorial zonal surface flow 
forced by interior ocean dynamics, whereas associated SST variations 
are seasonally modulated. On decadal timescales, the strength and 
period of the deep-jet oscillations may vary over time. The modulation 
could be due, for example, to a change in the dominant baroclinic 
mode affecting the basin mode period”*”®. Such behaviour is suggested 
by Supplementary Fig. 8, although other modes of variability, such as 
the Pacific El Niño/Southern Oscillation* and the North Atlantic
Oscillation’, could also be influencing the time series. Despite this 
caveat, the surface expressions of the deep jets can clearly be used to 
improve the prediction of equatorial Atlantic SST, which is crucial for 
seasonal to interannual climate forecasting in the region’. 


METHODS SUMMARY 


We calculated surface zonal velocity anomaly at the Equator, averaged between 
35° W and 15° W (Fig. 1b), by applying a second-order fit in latitude to monthly 
mean meridional sea level anomaly distributions between 1° N and 1° S and evalu- 
ating equatorial geostrophy using the obtained curvature. The standard error of 
annual mean Argo float velocities (Fig. 1b) was calculated by dividing their stand- 
ard deviation by the square root of the number of float observations. We filtered 
monthly time series (Fig. 1b) using a running annual mean. Harmonic analyses of 
zonal velocity and temperature (Figs 1b and 3) were performed by applying a 
linear regression model in a least-squares sense to the data. We approximated the 
degrees of freedom used for the calculation of the standard error of the resulting 
amplitudes as the length of the time series divided by a quarter of the deep-jet 
oscillation period. The significance of the correlation (Fig. 1a) was obtained using 
surface wind and rainfall time series of the same length as the microwave optimally 
interpolated SST with corresponding degrees of freedom. Sources and time inter- 
vals of all data sets used in this study are given in Supplementary Table 1. 
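
The curvature-based estimate of equatorial geostrophic velocity described above can be sketched as follows. On an equatorial beta-plane, differentiating the geostrophic balance in latitude gives beta * u = -g * d²(eta)/dy² at the Equator, so the second-order coefficient of a parabola fitted to sea level anomaly yields the zonal velocity. This is a minimal illustration, not the authors' processing code: the function names, the constants and the sample latitudes are assumptions made here.

```python
import math

G = 9.81                      # gravitational acceleration, m s^-2 (assumed value)
OMEGA = 7.292e-5              # Earth's rotation rate, rad s^-1
R_EARTH = 6.371e6             # mean Earth radius, m
BETA = 2.0 * OMEGA / R_EARTH  # meridional gradient of the Coriolis parameter at the Equator


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - s) / m[r][r]
    return x


def equatorial_geostrophic_u(lat_deg, sla_m):
    """Zonal velocity (m/s) at the Equator from a parabola fitted to sea level anomaly."""
    # Least-squares fit of eta = c0 + c1*lat + c2*lat^2, with lat in degrees.
    rows = [[1.0, lat, lat * lat] for lat in lat_deg]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atb = [sum(r[i] * eta for r, eta in zip(rows, sla_m)) for i in range(3)]
    _, _, c2 = solve3(ata, atb)
    m_per_deg = math.radians(1.0) * R_EARTH
    d2eta_dy2 = 2.0 * c2 / m_per_deg ** 2  # curvature in m^-1
    return -G * d2eta_dy2 / BETA


if __name__ == "__main__":
    # Hypothetical sea level anomalies (m) between 1° S and 1° N.
    lats = [-1.0, -0.5, 0.0, 0.5, 1.0]
    sla = [0.0191, 0.0198, 0.0200, 0.0198, 0.0191]
    print(equatorial_geostrophic_u(lats, sla))
```

A sea level maximum on the Equator (negative curvature) corresponds to eastward flow, which matches the sign convention of an equatorial Kelvin wave.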


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Surface zonal velocity anomaly at the Equator, averaged between 35° W and 15° W 
(Fig. 1b), was calculated by applying a second-order fit in latitude to monthly mean 
meridional sea level anomaly distributions between 1° S and 1° N and evaluating 
equatorial geostrophy using the obtained curvature. Mean zonal velocities from 
Argo float drifts between 950 and 1,050 m (Fig. 1b) were derived by removing 
outliers using a standard-deviation criterion and averaging over time (1-yr period) 
and space (from 1° S to 1° N and from 35° W to 15° W). The standard error of the 
nominal 1,000-m zonal velocities (Fig. 1b) was calculated by dividing their standard 
deviation by the square root of the number of float observations. Monthly time 
series (Fig. 1b) were filtered using a running annual mean. The dominant period of 
these time series was estimated by calculating the variance explained by a plane- 
wave fit (Supplementary Fig. 1). 
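
The averaging of float velocities described above (outlier removal, mean, and standard error as standard deviation over the square root of the number of observations) can be sketched as below. The text does not state the threshold of the standard-deviation criterion, so the three-sigma cut-off used here is an assumption, as are the function name and the sample values.

```python
import math
import statistics


def mean_and_standard_error(velocities, n_sigma=3.0):
    """Mean and standard error after discarding values beyond n_sigma standard deviations.

    n_sigma=3.0 is an assumed threshold; the source only mentions
    'a standard-deviation criterion'.
    """
    mu = statistics.fmean(velocities)
    sd = statistics.stdev(velocities)
    kept = [v for v in velocities if abs(v - mu) <= n_sigma * sd]
    mean = statistics.fmean(kept)
    # Standard error: standard deviation divided by sqrt(number of observations).
    stderr = statistics.stdev(kept) / math.sqrt(len(kept))
    return mean, stderr, len(kept)


if __name__ == "__main__":
    # Hypothetical float drift velocities (m/s) within one averaging box and year.
    sample = [0.03, 0.04, 0.05, 0.06, 0.07] * 4 + [5.0]
    print(mean_and_standard_error(sample))
```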

In the subsurface temperature and velocity time series from PIRATA buoys and 
subsurface moorings, which are used to produce Fig. 3, data gaps are present. Here 
monthly time series were derived by monthly averaging and subtracting a mean 
annual cycle. 

Harmonic analyses of zonal velocity and temperature time series (Figs 1b and 
3b, c) were performed by fitting the following linear regression model in a least- 
squares sense to the monthly data: 


$$d_m = g\beta = \beta_1 I_N + \beta_2 \cos(\omega t) + \beta_3 \sin(\omega t)$$

Here $t$ is the time vector corresponding to the data vector, $d$, both of which are of
length $N$; $\cos(\omega t)$ and $\sin(\omega t)$ are the vectors whose elements are the cosines and
sines of the elements of $\omega t$, respectively; $\omega = 2\pi/p$ is the angular frequency, where
$p$ is the period; $g$ is the model matrix; $\beta$ is a column vector of scalar model factors
($\beta_1$, $\beta_2$ and $\beta_3$); and $I_N$ is a vector of length $N$ whose elements all equal 1. The error
matrix is given by


$$\Delta\beta^2 = (g^{T} g)^{-1}\,\frac{(d - d_m)^{T}(d - d_m)}{n - k}$$

where $n$ is the number of degrees of freedom and $k = 2$ is the number of dependent
model factors. The standard errors of the elements of $\beta$ are given by the square roots of the
diagonal elements of $\Delta\beta^2$. The degrees of freedom used for the calculation of the standard
error of the resulting amplitudes were approximated as the length of the time series
divided by a quarter of the deep-jet oscillation period, and are n = 14 for ATL3 SST
(HadISST), n = 10 for ATL3 SST (microwave optimally interpolated SST), n = 14 
for geostrophic zonal velocity anomaly, n = 10 for the Argo float drift data (Fig. 1b
and Supplementary Table 2), and n = 5 to n = 7 for the moored zonal velocity and
subsurface temperature data (Fig. 3) varying with depth owing to data gaps. The
phase errors (Fig. 3c) are maximum errors derived using the standard errors of the
model factors ($\Delta\beta_2$ and $\Delta\beta_3$) by applying linear error propagation for an arbitrary
phase lag. 
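
The harmonic regression described above can be sketched in a few lines. The example fits a constant plus a cosine and a sine at a fixed period, scales the residual variance by the effective degrees of freedom (record length divided by a quarter of the period, as in the text) and reports amplitude and phase. It is an illustrative sketch under those assumptions, not the authors' code; the function names and the synthetic series are hypothetical, and the standard errors are taken as square roots of the diagonal of the error matrix, which is the usual least-squares convention.

```python
import math


def invert3(m):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[x / det for x in row] for row in adj]


def effective_dof(record_length_days, period_days):
    """Degrees of freedom: record length divided by a quarter of the period."""
    return int(record_length_days / (period_days / 4.0))


def harmonic_fit(t_days, d, period_days, n_dof, k=2):
    """Least-squares fit of d ~ b1 + b2*cos(w*t) + b3*sin(w*t).

    Returns (beta, standard_errors, explained_variance).
    """
    w = 2.0 * math.pi / period_days
    g = [[1.0, math.cos(w * t), math.sin(w * t)] for t in t_days]  # model matrix
    gtg = [[sum(r[i] * r[j] for r in g) for j in range(3)] for i in range(3)]
    gtd = [sum(r[i] * di for r, di in zip(g, d)) for i in range(3)]
    inv = invert3(gtg)
    beta = [sum(inv[i][j] * gtd[j] for j in range(3)) for i in range(3)]
    fit = [sum(b * gi for b, gi in zip(beta, row)) for row in g]
    rss = sum((di - fi) ** 2 for di, fi in zip(d, fit))
    # Error matrix: (g^T g)^-1 * residual sum of squares / (n - k).
    stderr = [math.sqrt(inv[i][i] * rss / (n_dof - k)) for i in range(3)]
    mean = sum(d) / len(d)
    tss = sum((di - mean) ** 2 for di in d)
    return beta, stderr, 1.0 - rss / tss


def amplitude_phase(beta):
    """Amplitude A and phase phi such that the harmonic is A*cos(w*t - phi)."""
    return math.hypot(beta[1], beta[2]), math.atan2(beta[2], beta[1])
```

Scanning `period_days` over a range of candidate values and keeping the one with the largest explained variance gives a simple estimate of the dominant period.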

The significance of the correlation (Fig. 1a) is obtained using surface wind and
rainfall time series of the same length as the microwave optimally interpolated SST 
data series (Fig. 1b), which are additionally 270-d low-pass-filtered and have 
n = 10. Sources and periods of all data sets used are given in Supplementary
Table 1.

Equatorial zonal velocities from 1,000-m Argo float drift data acquired between 
1°S and 1° N were plotted as functions of time and longitude in Fig. 4. The data 
model 


$$d_m = U \sin(kx - \omega t - \phi I_N)$$

was applied to the observed zonal velocities. Here $U$ is the zonal velocity amplitude,
$\sin(kx - \omega t - \phi I_N)$ is the vector whose elements are the sines of the elements of
$kx - \omega t - \phi I_N$, $x$ is the space vector in the zonal direction corresponding to the data
vector, $k$ is the zonal wavenumber and $\phi$ is the phase. By maximizing the variance
explained by the fit, propagation characteristics of the dominant interannual 
variability were obtained (Supplementary Fig. 4). In the Atlantic, this fit explains 
about 28% of the variance of the equatorial zonal velocity from Argo float drift data 
after subtracting the annual and semi-annual cycles. In the Pacific, the strongest 
interannual signal (which explains only 8% of the variance) is found at a period of 
740 d with a zonal wavelength of 56 × 10³ km. The associated phase velocity corresponds
to a first-baroclinic-mode Rossby wave that is very probably forced by the
wind (Supplementary Fig. 4). Uncertainties in period and wavelength were esti- 
mated by a non-parametric bootstrap procedure where a number of resamples was 
constructed by random sampling with replacement (Supplementary Fig. 5). 
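
The plane-wave fit described above can be illustrated with a small grid search. For a fixed wavenumber and frequency, U sin(kx - ωt - φ) is linear in the coefficients of sin(kx - ωt) and cos(kx - ωt), so each grid point needs only a 2 × 2 least-squares solve. The pair with the largest explained variance is then selected. This is a sketch under assumptions made here: the function names, the candidate grids and the convention that a negative wavelength denotes westward phase propagation are not taken from the source, and the bootstrap step is omitted.

```python
import math


def fit_plane_wave(x_km, t_days, u, k, omega):
    """Least-squares fit of U*sin(k*x - omega*t - phi) for fixed k (rad/km) and omega (rad/day).

    Returns (U, phi, explained_variance).
    """
    theta = [k * x - omega * t for x, t in zip(x_km, t_days)]
    s = [math.sin(th) for th in theta]
    c = [math.cos(th) for th in theta]
    # U*sin(theta - phi) = a*sin(theta) + b*cos(theta), with a = U*cos(phi), b = -U*sin(phi).
    ss = sum(v * v for v in s)
    cc = sum(v * v for v in c)
    sc = sum(p * q for p, q in zip(s, c))
    su = sum(p * q for p, q in zip(s, u))
    cu = sum(p * q for p, q in zip(c, u))
    det = ss * cc - sc * sc
    a = (su * cc - cu * sc) / det
    b = (cu * ss - su * sc) / det
    mean = sum(u) / len(u)
    rss = sum((ui - a * si - b * ci) ** 2 for ui, si, ci in zip(u, s, c))
    tss = sum((ui - mean) ** 2 for ui in u)
    return math.hypot(a, b), math.atan2(-b, a), 1.0 - rss / tss


def best_plane_wave(x_km, t_days, u, wavelengths_km, periods_days):
    """Grid search for the signed wavelength and period with maximum explained variance.

    Returns (explained_variance, wavelength_km, period_days, U, phi).
    """
    best = None
    for lam in wavelengths_km:
        for p in periods_days:
            amp, phi, ev = fit_plane_wave(
                x_km, t_days, u, 2.0 * math.pi / lam, 2.0 * math.pi / p
            )
            if best is None or ev > best[0]:
                best = (ev, lam, p, amp, phi)
    return best
```

The phase velocity of the selected wave follows as wavelength divided by period, which is how a fitted signal can be compared with the expected speed of a given baroclinic mode.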

Moored velocity data were acquired using acoustic Doppler current profilers, 
different single-point current meters and a moored profiler (Figs 2 and 3 and 
Supplementary Fig. 3). The oceanic variability on short timescales clearly exceeds 
the measurement accuracy of the different instruments. Owing to a ballasting 
error, the moored profiler was deployed ‘light’ and suffered loss of drive-wheel 
traction over time, resulting in truncation of the down-going profiles as time 
progressed (Fig. 2). 
Immunogenicity of induced pluripotent stem cells


Tongbiao Zhao, Zhen-Ning Zhang, Zhili Rong & Yang Xu


Induced pluripotent stem cells (iPSCs), reprogrammed from so- 
matic cells with defined factors, hold great promise for regenerative 
medicine as the renewable source of autologous cells'°. Whereas it 
has been generally assumed that these autologous cells should be 
immune-tolerated by the recipient from whom the iPSCs are 
derived, their immunogenicity has not been vigorously examined. 
We show here that, whereas embryonic stem cells (ESCs) derived 
from inbred C57BL/6 (B6) mice can efficiently form teratomas in 
B6 mice without any evident immune rejection, the allogeneic ESCs 
from 129/SvJ mice fail to form teratomas in B6 mice due to rapid 
rejection by recipients. B6 mouse embryonic fibroblasts (MEFs) 
were reprogrammed into iPSCs by either retroviral approach 
(ViPSCs) or a novel episomal approach (EiPSCs) that causes no 
genomic integration. In contrast to B6 ESCs, teratomas formed 
by B6 ViPSCs were mostly immune-rejected by B6 recipients. In 
addition, the majority of teratomas formed by B6 EiPSCs were 
immunogenic in B6 mice with T cell infiltration, and apparent 
tissue damage and regression were observed in a small fraction of 
teratomas. Global gene expression analysis of teratomas formed by 
B6 ESCs and EiPSCs revealed a number of genes frequently over- 
expressed in teratomas derived from EiPSCs, and several such gene 
products were shown to contribute directly to the immunogenicity 
of the B6 EiPSC-derived cells in B6 mice. These findings indicate 
that, in contrast to derivatives of ESCs, abnormal gene expression in 
some cells differentiated from iPSCs can induce T-cell-dependent 
immune response in syngeneic recipients. Therefore, the immunogenicity
of therapeutically valuable cells derived from patient-specific
iPSCs should be evaluated before any clinical application of
these autologous cells into the patients. 

To vigorously examine the immunogenicity of cells derived from 
iPSCs, we took advantage of the capability of ESCs and iPSCs to form 
teratomas in mice that allows the simultaneous evaluation of the 
immunogenicity of various cell types derived from them. Whereas
B6 ESCs could efficiently form teratomas in B6 mice without any 
evidence of immune rejection as indicated by the lack of any detectable 
CD4⁺ T cell infiltration, a hallmark of immune rejection, the allogeneic
129/SvJ (129) ESCs were rapidly rejected before forming detectable
teratomas in the same B6 recipients, with massive infiltration of
CD4⁺ T cells into the one detectable teratoma formed by 129 ESCs
(Fig. 1a-d, Supplementary Fig. 1). The CD4⁺ cells were not directly
differentiated from the implanted ESCs because no CD4⁺ cells were
detectable in any examined teratomas formed by B6 and 129 ESCs in
severe combined immunodeficient (SCID) mice (Fig. 1d). B6 and 129
ESCs had similar proliferation rates and both could efficiently form
teratomas in SCID mice (Supplementary Fig. 1a-e). Therefore, these
findings validate the feasibility of using this teratoma formation assay to
evaluate the immunogenicity of iPSC derivatives in vivo.

We initially established ViPSCs from B6 MEFs with the cocktails of 
retrovirus expressing either three (Oct4/Sox2/Klf4) or four
(Oct4/Sox2/Myc/Klf4) reprogramming factors as described’. The subcloned
ViPSCs had normal karyotypes, expressed ESC-specific surface 
markers and pluripotency genes, and were pluripotent as indicated 
by their capability to form teratomas in SCID mice and contribute 
to adult chimaeric mice (Supplementary Fig. 2a-g). Four independent
iPSC clones, two reprogrammed with three factors (V3-1 and V3-3)
and two with four factors (V4-1 and V4-2), were selected for further 
analysis (Supplementary Fig. 2h). Most implanted B6 ViPSCs failed to 
form detectable teratomas or formed teratomas that were subsequently 
immune-rejected with T cell infiltration and massive necrosis (Sup- 
plementary Fig. 3a-e). The teratomas that did not undergo apparent 
regression were also infiltrated with CD4⁺ T cells with apparent


b
Cell line (number)    Day 21 (%)     Day 30 (%)
C57BL/6 (1 x 10)      12/12 (100)    18/19 (95)
C57BL/6 (3 x 10)      -              12/12 (100)
129 (1 x 10)          0/12 (0)       0/19 (0)
129 (3 x 10)          -              1*/12 (8.3)
-, not determined.

d
Panel labels: anti-CD3; spleen; ESC: B6, recipient: B6; ESC: 129, recipient: B6; ESC: 129.


Figure 1 | Immunogenicity of syngeneic and allogeneic ESCs in male B6
mice. a, B6 but not 129 ESCs can efficiently form teratomas in B6 mice after
subcutaneous injection. The teratoma shown is from 30 days after implantation.
b, Summary of teratoma formation by ESCs in B6 mice 21 and 30 days after
implantation. Only one small teratoma formed by 129 ESCs was detected 
(asterisk) and is shown in c. d, Infiltration of T cells was detected in the 
teratomas formed by 129 ESCs but not the ones formed by B6 ESCs in B6 mice. 
T cells were identified by anti-CD4 and anti-CD3 antibodies. Sections from the 
spleen and teratomas formed by 129 ESCs in SCID mice were used as positive 
and negative controls. 
necrosis within parts of the tumour (Supplementary Fig. 3d, e).
Therefore, cells derived from B6 ViPSCs are highly immunogenic in 
B6 mice. 

Recent studies have shown the existence of T cells specific for the 
cells expressing Oct4 in the periphery’. Therefore, the reactivation of 
Oct4 expression in cells differentiated from B6 ViPSCs could induce 
immune responses in B6 mice (Supplementary Fig. 2i). To address this 
issue, we developed a novel episomal approach to reprogram B6 MEFs 
into EiPSCs that express ESC markers and pluripotency genes as well 
as contribute to adult chimaeric mice (Fig. 2a-e). Extensive Southern
blotting analysis demonstrated that some EiPSC clones (1E12, 1E13, 
3E1) had lost the episomal vector and harboured no random integ- 
ration of the reprogramming vector (Fig. 2f). The expression cassette 
was excised from the genome of 2E2 iPSC clone that harboured one 
random integration of the episomal vector by transient expression of 
Cre enzyme (Supplementary Fig. 4). 

EiPSCs had normal karyotypes and efficiently formed teratomas in 
B6 mice. However, the majority of teratomas derived from EiPSCs of 
both early and late passages showed apparent infiltration of T cells 
(Figs 3a, d and Supplementary Fig. 5). In addition, apparent tumour 
regression with extensive tissue necrosis was detected in 10% of 
teratomas formed by EiPSCs in B6 mice within 2 months of implanta- 
tion (Fig. 3b, c). No apparent tumour regression was observed in the 
majority of the teratomas formed by EiPSCs in B6 mice before they 
Figure 2 | A new episomal approach to generate EiPSCs from B6 MEFs.

a, Diagram of the episomal vector that expresses the four reprogramming 
factors (Oct4/Sox2/Nanog/Lin28) and puromycin resistance gene from one 
messenger RNA separated by IRES sequences. The entire expression cassette is 
flanked with LoxP sites. b, c, EiPSCs were positive for alkaline phosphatase 
(AP) and SSEA-1 (b) and expressed pluripotency genes to the same levels as 
those of B6 ESCs as determined by quantitative real-time PCR (c). The mRNA 
levels in MEFs are arbitrarily set to 1. d, EiPSCs form teratomas in both B6 mice 
reached the allowed maximal size (Fig. 3c). Therefore, we concluded
that cells derived from B6 EiPSCs can be immunogenic in B6 recipi- 
ents, but their overall immunogenicity is lower than the cells derived 
from B6 ViPSCs. 

To determine the generality of our conclusion, two independently 
generated integration-free B6 iPSC lines, which were reprogrammed 
from B6 MEFs with a plasmid vector expressing Oct4/Sox2/Myc/Klf4
(ref. 7), were implanted into B6 mice. T cell infiltration was observed in 
most teratomas formed by these B6 iPSCs in B6 mice, some of which 
also exhibit tissue necrosis (Supplementary Fig. 6). In addition, a small 
fraction of teratomas had undergone apparent regression by 40 days 
after implantation. These findings support the conclusion that cells 
derived from iPSCs are immunogenic in syngeneic recipients. 

To understand the basis of this immunogenicity, we profiled gene
expression in teratomas derived from B6 ESCs and EiPSCs, which revealed a
number of genes overexpressed in teratomas derived from B6 EiPSCs
(Supplementary Fig. 7a). Expression analysis of six regressing terato- 
mas formed by two independent B6 EiPSCs in B6 mice indicated that 9 
of the 23 tested genes (Lce1f, Spt1, Cyp3a11, Zg16, Lce3a, Chi3L4, Olr1,
Retn, Hormad1) were commonly overexpressed in these teratomas
(Fig. 4a). Hormad1 has been identified as a tumour antigen and Spt1
as a tissue-specific antigen*”. 

To test the possibility that the abnormal expression of these genes in
teratomas derived from B6 iPSCs contributes to their immunogenicity 
(top panel) and SCID mice (bottom panel). e, EiPSCs can contribute to adult
chimaeric mice after injecting into the blastocysts derived from albino mice. 
f, Southern blotting analysis indicates no random integration of the episomal 
vectors in EiPSC clones 1E-12, 1E-13 and 3E-1. Clone 2E2 has one copy of the 
episomal vector integrated into the genome. Genomic DNA derived from 
iPSCs was digested with BamHI and hybridized to various probes that together 
cover the entire episomal vector. 
Teratoma outcome     Number    Ratio (%)
No tumour            7/64      11
Regressing tumour    6/64      9
Growing tumour       51/64     80

EiPSCs    Day 30        CD4 infiltration    Day 42      CD4 infiltration    Day 56      CD4 infiltration
1E12      11/18 (85)    3/6 (50)            6/7 (86)    2/3 (67)            4/5 (80)    3/4 (75)
1E13      10/11 (91)    3/4 (75)            5/7 (71)    2/3 (67)            3/4 (75)    3/3 (100)
2E2-12    5/7 (71)      2/3 (67)            4/5 (80)    2/2 (100)           4/5 (80)    3/3 (100)


in B6 mice, seven such genes were ectopically expressed in B6 ESCs 
and their derived teratomas under the control of the ubiquitously 
active CAG promoter/enhancer (Supplementary Fig. 7b). Like B6 
ESCs, over 90% of implants of B6 ESCs with empty vector as well as 
Figure 3 | Cells derived from B6 EiPSCs can be immunogenic in B6 mice. 
a, T-cell infiltration was detected in the majority of teratomas formed by B6 
EiPSCs in male B6 mice. 2E2-12 is a subclone of the 2E2 clone obtained after 
LoxP/Cre-mediated deletion of the reprogramming factor expression cassette 
from the integrated copy of the episomal vector. b, Tissue necrosis was detected in the 
regressing teratomas formed by B6 EiPSCs in male B6 mice. H&E, haematoxylin 
and eosin staining. c, Summary of teratoma formation by B6 EiPSCs in male B6 
mice. d, Summary of teratoma formation and CD4+ T-cell infiltration at 
different time points after implantation of EiPSCs in male B6 mice. 
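
The ratios summarized for panel c follow directly from the counts. A minimal sketch, assuming the counts of 7, 6 and 51 implants out of 64 as tabulated for Fig. 3c; the function and variable names are illustrative and not part of the original study:

```python
# Counts taken from the Fig. 3c summary: outcome -> number of implants (out of 64).
outcomes = {"No tumour": 7, "Regressing tumour": 6, "Growing tumour": 51}
total = 64


def percentages(counts, total):
    """Return each count as a percentage of total, rounded to the nearest integer."""
    return {name: round(100 * n / total) for name, n in counts.items()}


ratios = percentages(outcomes, total)
for name, pct in ratios.items():
    print(f"{name}: {outcomes[name]}/{total} = {pct}%")
```

The three counts add up to 64, so every implant falls into exactly one outcome class.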


transgenic Lce1f-B6 ESCs and Retn-B6 ESCs formed teratomas in B6 
mice (Fig. 4b). In contrast, over 80% of Zg16-B6 ESC implants and 50% 
of Hormad1- or Cyp3a11-B6 ESC implants failed to form visible 
teratomas in B6 mice (Fig. 4b). Extensive T cell infiltration and wide- 
spread necrosis were detected in the teratomas formed by Zg16- and 
Hormad1-B6 ESCs in B6 mice but rarely detectable in the teratomas 
derived from Lce1f- and Retn-B6 ESCs in B6 mice (Fig. 4c, d). To rule 
out the possibility that the regression of the teratomas formed by Zg16- 
and Hormad1-B6 ESCs in B6 mice is secondary to abnormal proliferation 
or cell death induced by the ectopic expression of these genes, 
we compared the proliferation and survival of Zg16- and Hormad1-B6 ESCs 
with those of B6 ESCs and found them to be identical (Supplementary 
Fig. 7c, e). In addition, the weight of the teratomas formed by Zg16- and 
Hormad1-B6 ESCs in SCID mice was similar to that of teratomas formed by 
B6 ESCs (Supplementary Fig. 7d). 

To identify the immune responses against the cells derived from 
iPSCs, we used CD4−/− and CD8−/− B6 mice to examine the importance 
of T cells in the immune rejection. The robust immune rejection of 
the teratomas formed by B6 ViPSCs as well as Zg16- and Hormad1-B6 
ESCs in B6 mice was abolished in both CD4−/− and CD8−/− B6 mice 


[Figure 4 panels. a, Regressing tumours 1-6 scored for overexpression of Spt1 (NM009267), Cyp3a11 (NM007818), Lce3a (NM001039594), Lce1f (NM026394), Zg16 (NM026918) and Hormad1 (NM026489). b, Teratoma formation at day 30 by cell line; Vector-B6, 17/18 (94.4%). c, Staining with anti-IgG, anti-CD3, anti-CD4 and anti-CD8 for teratomas expressing Zg16, Hormad1, Cyp3a11, Lce1f and Retn.] 


Figure 4 | Abnormal overexpression of some proteins contributes directly 
to the immunogenicity of cells derived from B6 EiPSC in B6 mice. a, Nine 
genes were found to be commonly overexpressed in six regressing teratomas 
formed by two B6 EiPSCs. The expression of 23 genes identified as 
overexpressed in EiPSC-derived teratomas by microarray analysis was analysed 
by real-time PCR. b, Summary of teratoma formation by various transgenic B6 
ESCs in male B6 mice. c, Extensive infiltration of T cells in the teratomas 
formed by Cyp3a11-, Hormad1- and Zg16-B6 ESCs in B6 mice. Few infiltrating 
T cells were detectable in the teratomas formed by Lce1f- and Retn-B6 ESCs in 
B6 mice. Representative images are shown. d, Extensive necrosis is present in 
teratomas formed by Cyp3a11-, Hormad1- and Zg16-B6 ESCs in B6 mice. 

e, The immune rejection of the teratomas formed by Hormad1-B6 ESCs, Zg16- 
B6 ESCs, B6 EiPSCs and B6 ViPSCs is abolished in CD4−/− or CD8−/− B6 
mice. f, IFN-γ release assay to detect the presence of primed T cells specific for 
cells expressing Hormad1 and Zg16 in B6 mice harbouring the teratomas 
formed by EiPSCs. Each data point represents the mean of duplicate cultures. 
Consistent data are obtained from three independent experiments. 
(Fig. 4e). In addition, no regression of teratomas formed by EiPSCs in 
CD4−/− or CD8−/− B6 mice was observed. Therefore, both CD4+ 
helper T cells and CD8+ cytotoxic T cells are critical for this immune 
rejection. These findings also indicate that innate immunity does 
not have an important role in the immune rejection of the cells derived 
from iPSCs. 

To further determine whether the abnormal expression of Hormad1 
and Zg16 in teratomas formed by EiPSCs directly activates T-cell 
responses in B6 mice, we performed the IFN-γ release assay, which 
measures the antigen-specific activation of in vivo-primed T cells’°. 
Dendritic cells purified from B6 mice were transfected with either 
empty expression vector or vectors expressing Zg16, Hormad1, Retn 
or EGFP (enhanced green fluorescent protein). LPS-matured dendritic 
cells expressing Hormad1 or Zg16, but not dendritic cells expressing 
Retn or EGFP, could induce IFN-γ production from purified T 
cells, indicating the presence of primed T cells specific for cells expressing 
Hormad1 or Zg16 in B6 mice harbouring the teratomas formed by 
EiPSCs (Fig. 4f). Although these findings did not identify the specific 
peptides responsible for activating T cells, they demonstrate that the 
abnormal expression of Hormad1 and Zg16 contributes directly to the 
immunogenicity of the cells derived from EiPSCs in syngeneic recipi- 
ents. Hormad1 was also overexpressed in most teratomas formed by 
four independently generated integration-free iPSCs reprogrammed 
with adenoviral vectors, recombinant proteins or plasmid vectors”*?””. 
In addition, Zg16 was overexpressed in most teratomas formed by 
iPSCs reprogrammed with recombinant proteins. Therefore, the 
abnormal expression of such immunogenic proteins could represent 
a common mechanism to induce T cell-mediated immune responses 
to cells derived from iPSCs. 

Our findings indicate that some cells derived from iPSCs can be 
immunogenic in syngeneic recipients. The T-cell-dependent immune response 
is probably due to the abnormal expression of antigens not expressed 
during normal development or differentiation of ESCs, leading to the 
break of peripheral tolerance. The expression of these minor antigens 
could be due to the subtle yet apparent epigenetic difference between 
iPSCs and ESCs'*®. In addition, recently discovered mutations in the 
coding sequences of iPSCs could also contribute to the immunogenicity 
of iPSC derivatives*’. Therefore, for the clinical development of iPSCs, 
current reprogramming technology needs to be optimized to minimize 
the epigenetic difference between iPSCs and ESCs. The in vivo immunogenicity 
test described here can provide a robust screening platform 
for improving the reprogramming technology. 


METHODS SUMMARY 

Mice. B6 mice and ESCs were purchased from The Jackson Laboratory. Only male 
mice were used in the transplantation studies. All animal experiments were performed 
in accordance with relevant guidelines and regulations, and approved by 
the Institutional Animal Care and Use Committee (IACUC). 

iPSC generation and characterization. MEFs were isolated from B6 embryos as 
described”*. For ViPSC production, MEFs were transduced with a retrovirus cocktail 
as described’. For EiPSC generation, MEFs were transfected with the episomal 
vector expressing the reprogramming factors. The lack of random integration of 
the episomal vector was confirmed by Southern blotting analysis with a combina- 
tion of probes that cover the entire episomal vector. 

Interferon-γ release assay. Dendritic cells from B6 mice were isolated and 
transfected with expression vectors. The transfected dendritic cells were matured 
by lipopolysaccharide (LPS) treatment for 12 h. T cells were purified from the 
pooled spleens and lymph nodes of five B6 mice harbouring the teratomas formed 
by EiPSCs and co-cultured with LPS-matured dendritic cells. Supernatants were 
collected at the indicated time points to determine the IFN-γ levels. 
Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS 


Mice, cell culture and reprogramming episomal vector construction. C57BL/6 
(B6) inbred mouse strain and ESCs were purchased from The Jackson Laboratory. 
Only male mice were used in the transplantation studies of ESCs and iPSCs. All 
animal experiments were performed in accordance with relevant guidelines and 
regulations, and approved by the Institutional Animal Care and Use Committee 
(IACUC). The ESCs and iPSCs were grown on the feeder layer derived from B6 
MEFs under standard conditions. The full-length cDNA of Oct4, Sox2, Lin28 and 
Nanog was sequentially inserted downstream of the CAG promoter in the episo- 
mal vector, separated by IRES sequences (Fig. 2a). The fifth gene, the puromycin 
resistance gene, is at the 3’ end of this mRNA transcript, separated from Nanog 
cDNA by the IRES sequence. This episomal vector is denoted pCOSLNP (CAG- 
Oct4-Sox2-Lin28-Nanog-Puro). Two LoxP sites in the same orientation were 
inserted into the episomal vector flanking the entire expression cassette. 

iPSC generation and characterization. MEFs were isolated from B6 embryos as 
previously described”. For ViPSC production, MEFs were transduced with a retrovirus 
cocktail expressing Oct4, Sox2 and Klf4, with or without c-Myc. The iPSC colonies 
were picked 18 days after infection as described’. For EiPSC generation, MEFs were 
transfected with pCOSLNP vector using Basic Nucleofector Kit for Primary 
Mammalian Fibroblasts (Lonza) followed by puromycin selection for 3 days, and 
then plated on irradiated B6 MEF feeders. Three weeks later, the culture was 
replated on fresh feeder cells. iPSC colonies were picked 10 to 30 days after replat- 
ing, and the lack of random integration of the episomal vector was confirmed by 
Southern blotting analysis with a combination of probes that cover the entire 
episomal vector. 

Quantitative real-time PCR analysis. Total RNA was purified from fibroblasts, 
ES cells, iPS cells and teratomas with an RNeasy total RNA isolation kit (Qiagen). 
Total RNA (1 μg) was reverse transcribed into cDNA, which was analysed by 
quantitative real-time PCR as previously described’*. The primers used 
were as follows: Oct4F, 5'-GGCTCTCCCATGCATTCAA-3'; 
Oct4R, 5'-TTTAACCCCAAAGCTCCAGG-3'; Sox2F, 5'-AAATCTCCGCAGCGAAACG-3'; 
Sox2R, 5'-CCCCAAAAAGAAGTCCCAAGA-3'; Lin28F, 5'-CTGCTGTAGCGTGATGGTTGA-3'; 
Lin28R, 5'-CCACCCAATGTGTTCTATTGCA-3'; NanogF, 5'-TCGCCATCACACTGACATGA-3'; 
NanogR, 5'-TGTGCAGAGCATCTCAGTAGCA-3'; Rex1F, 5'-ACGAGTGGCAGTTTCTTCTTGGGA-3'; 
Rex1R, 5'-TATGACTCACTTCCAGGGGGCACT-3'; Gdf3F, 5'-GATTGCITTTTCTGCGGTCTGT-3'; 
Gdf3R, 5'-CCAAGTTCTTCAGTCGGTTGCT-3'. 
Primers used for detection of reprogramming factor deletion were as follows: 
Oct4F (43-63), 5'-CCTTCCTTCCCCATGGCGGGA-3'; 
IRESR1 (53-31), 5'-TTATTCCAAGCGGCTTCGGCCAG-3'; 
Sox2F (1292-1310), 5'-CCCCAGCAGACTTCACATGT-3'; 
IRES-R (221-202), 5'-AGGAACTGCTTCCTTCACGA-3'; 
IRESF2 (476-498), 5'-TCGGTGCACATGCTTTACATGTG-3'; 
Lin28R (369-352), 5'-CCGGAACCCTTCCATGTG-3'; 
NanogRTGA (1131-1111), 5'-TCACACGTCTTCAGGTTGCAT-3'; 
P1, 5'-CGCCATCTTCTGAAGCTGAATC-3'; P2, 5'-ACCGAAAGGAGCGCACGACCOCAT-3'; 
P3, 5'-CCTACTCAGACAATGCGATGCA-3'; GAPDHF, 5'-CCAGTATGACTCCACTCACG-3'; 
GAPDHR, 5'-GACTCCACGACATACTCAGC-3'; Lce1fF, 5'-CTGTAGCCTGGGTTCTGG-3'; 
Lce1fR, 5'-GACGATGGCGACGAAGAG-3'; Spt1F, 5'-TGAAACTCAGGCAGATAG-3'; 
Spt1R, 5'-TGTCAACGCCACTGTTCT-3'; Olr1F, 5'-TGGTGGTTCCCTGCTGCTA-3'; 
Olr1R, 5'-ATCCTGCTGAGTAAGGTTCG-3'; Zg16F, 5'-CATCACCGCCTTCCGTAT-3'; 
Zg16R, 5'-CGTTGAAACTTGTGCCTGA-3'; RetnF, 5'-TCCTTGTCCCTGAACTGC-3'; 
RetnR, 5'-ACGAATGTCCCACGAGCC-3'; Hormad1F, 5'-CCAGATTACCAACCACCAG-3'; 
Hormad1R, 5'-TGAAAAGGTGTTGGGACT-3'; Lce3aF, 5'-GGCAGTGGTCAGCAGTCT-3'; 
Lce3aR, 5'-TTGGGAAATCCATTAGAAGA-3'; Cyp3a11F, 5'-ATCCCATTGCTAATAGAC-3'; 
Cyp3a11R, 5'-ATCATCACTGTTGACCCT-3'; Chi3l4F, 5'-ATGGCTACACTGGAGAAA-3'; 
Chi3l4R, 5'-TGCTGGAAATCCCACAAT-3'. 
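
A primer list of this length is easy to corrupt in transcription, so a quick sanity check can be useful. The sketch below is illustrative only: the helper name `check_primer` and the choice of four example primers are assumptions, and the sequences are copied as they appear in this text. It reports length, GC content and any character that is not A, C, G or T. Two of the listed sequences (Gdf3F and P2) contain such characters as printed here, which most likely reflects a transcription error rather than the authors' actual primers; the correct bases cannot be inferred from this text.

```python
VALID_BASES = set("ACGT")


def check_primer(seq):
    """Summarize a primer: length, GC percentage and any non-ACGT characters."""
    seq = seq.upper()
    invalid = sorted(set(seq) - VALID_BASES)
    gc_count = seq.count("G") + seq.count("C")
    return {
        "length": len(seq),
        # GC content is computed over the full string, so it is only
        # meaningful when "invalid" is empty.
        "gc_percent": round(100 * gc_count / len(seq), 1),
        "invalid": invalid,
    }


# Sequences copied from the primer list above (5' to 3').
primers = {
    "Oct4F": "GGCTCTCCCATGCATTCAA",
    "GAPDHF": "CCAGTATGACTCCACTCACG",
    "Gdf3F": "GATTGCITTTTCTGCGGTCTGT",
    "P2": "ACCGAAAGGAGCGCACGACCOCAT",
}

for name, seq in primers.items():
    report = check_primer(seq)
    flag = "OK" if not report["invalid"] else f"check characters {report['invalid']}"
    print(f"{name}: {report['length']} nt, {flag}")
```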

Southern blotting analysis. Genomic DNA (10 μg) was digested with BamHI, 
separated on a 1% agarose gel and transferred to a nylon membrane. For the analysis 
of ViPSCs, the membrane was hybridized to the Oct4 cDNA probe. For the 
analysis of EiPSCs, the membrane was hybridized to cDNA probes for Oct4, 
Sox2, Lin28 and Nanog as well as the vector backbone probe. 

Teratoma formation and immunohistochemistry analysis. ESCs or iPSCs were 
collected, washed twice with PBS, and injected subcutaneously into the hind leg 
region of B6 or SCID mice. One or three million cells were used for each injection. 
Tumours were measured and surgically removed from the euthanized mice at the 
indicated time points. Teratomas were either fixed with 4% formaldehyde or frozen 
in optimal cutting temperature (OCT) compound. Sections were stained with 
haematoxylin and eosin or with various antibodies such as IgG, anti-CD4 and 
anti-CD3 (BD Biosciences) as we described previously~. 

Microarray assay. Total RNA was purified from the teratomas with an RNeasy 
total RNA isolation kit (Qiagen). Microarray assay was performed by SeqWright 
using an Affymetrix Mouse 430A 2.0 chip. 

Flow cytometric analysis. About 5 × 10° ESCs or iPSCs were stained for the 
expression of the ESC-specific surface marker with anti-SSEA-1 antibody (Stemgent). 
Isotype-matched normal antibodies were used as negative controls. The stained cells 
were analysed by a BD LSR-II using FACSDiva software (Becton Dickinson) as we 
previously described”. 

Interferon-γ release assay. To obtain dendritic cells from B6 mice with the 
dendritic cell purification kit (Miltenyi Biotec), bone marrow cells were isolated from 
B6 mice and grown in Petri dishes at a density of 10° cells ml−1 in complete medium 
supplemented with 10 ng ml−1 granulocyte/macrophage colony-stimulating factor 
(GM-CSF) and 5 ng ml−1 IL-4 according to the manufacturer's recommendation. 
On day 9, dendritic cells were purified and transfected with expression vectors using a 
mouse dendritic cell Nucleofector kit according to the manufacturer's instructions 
(Lonza). The transfected dendritic cells were matured by LPS treatment for 12 h. T 
cells were purified from the pooled spleens and lymph nodes of five B6 mice 
harbouring the teratomas formed by EiPSCs through negative selection using a 
pan T cell isolation kit (Miltenyi Biotec). Purified T cells (10°) were immediately 
co-cultured with LPS-matured dendritic cells (2 × 10°). Supernatants were collected 
at the indicated time points to determine the IFN-γ levels using an ELISA kit 
(Thermo Scientific). 


23. Song, H., Chung, S.-K. & Xu, Y. Modeling Disease in Human ESCs Using an Efficient 
BAC-Based Homologous Recombination System. Cell Stem Cell 6, 80-89 (2010). 
Agonist-bound adenosine A2A receptor structures 
reveal common features of GPCR activation 
Adenosine receptors and β-adrenoceptors are G-protein-coupled 
receptors (GPCRs) that activate intracellular G proteins on binding 
the agonists adenosine’ or noradrenaline’, respectively. GPCRs 
have similar structures consisting of seven transmembrane helices 
that contain well-conserved sequence motifs, indicating that they 
are probably activated by a common mechanism**. Recent structures 
of β-adrenoceptors highlight residues in transmembrane 
region 5 that initially bind specifically to agonists rather than to 
antagonists, indicating that these residues have an important role 
in agonist-induced activation of receptors’’. Here we present two 
crystal structures of the thermostabilized human adenosine A2A 
receptor (A2AR-GL31) bound to its endogenous agonist adenosine 
and the synthetic agonist NECA. The structures represent an intermediate 
conformation between the inactive and active states, 
because they share all the features of GPCRs that are thought to 
be in a fully activated state, except that the cytoplasmic end of 
transmembrane helix 6 partially occludes the G-protein-binding 
site. The adenine substituent of the agonists binds in a similar 
fashion to the chemically related region of the inverse agonist 
ZM241385 (ref. 8). Both agonists contain a ribose group, not found 
in ZM241385, which extends deep into the ligand-binding pocket 
where it makes polar interactions with conserved residues in H7 
(Ser 277^7.42 and His 278^7.43; superscripts refer to Ballesteros-Weinstein 
numbering’) and non-polar interactions with residues 
in H3. In contrast, the inverse agonist ZM241385 does not interact 
with any of these residues and comparison with the agonist-bound 
structures indicates that ZM241385 sterically prevents the conformational 
change in H5 and therefore it acts as an inverse agonist. 
Comparison of the agonist-bound structures of A2AR with the 
agonist-bound structures of β-adrenoceptors indicates that the 
contraction of the ligand-binding pocket caused by the inward 
motion of helices 3, 5 and 7 may be a common feature in the 
activation of all GPCRs. 

In the simplest model for the conformational dynamics of GPCRs’? 
there is an equilibrium between two states, R and R*. The inactive state R 
preferentially binds inverse agonists and the activated state R* preferen- 
tially binds agonists''. Only R* can couple and activate G proteins. 
Although there are far more complex schemes’’ describing intermedi- 
ates between R and R*, studies on rhodopsin have indicated that there is 
only one major conformational change that significantly alters the struc- 
ture of the receptor’. Thus the structures of dark-state rhodopsin’** and 
of opsin’*’° are considered to be representative structures for the R and 
R* state, respectively. Structures of six different GPCRs*?*"”~*" in con- 
formations closely approximating the R state have now been determined 
and it is clear that they are similar to each other, with root mean squared 
deviation (r.m.s.d.) between any pair of structures in the transmembrane 
domains being less than 3 Å. As observed in light activation of 
rhodopsin, the major structural difference between R and R* is the 
movement of the cytoplasmic ends of helices 5 and 6 away from the 
receptor core by 5-6 Å, opening up a cleft in the centre of the helix 
bundle where the carboxy terminus of a G protein can bind’’. Recently, 
the structure of an agonist-bound β2-adrenoceptor (β2-AR) was determined 
in complex with an antibody fragment (nanobody Nb80)°. This 
structure of β2-AR is very similar to the structure of opsin, which indicates 
that the nanobody mimicked the action of a G protein by maintaining 
the receptor structure in an activated state. Given the structural 
similarities between opsin and the β2-AR-Nb80 complex, it is likely that 
the structures of the R* states of other GPCRs are also highly similar. 
This is consistent with the same heterotrimeric G proteins being able to 
couple to multiple different receptors””. However, it is unclear whether 
conserved structures of R and R* indicate that all agonists activate the 
receptors in an identical fashion. The recent structures of a thermostabilized 
β1-AR bound to four different agonists indicated that a defining 
feature of agonist binding to this receptor is the formation of a hydrogen 
bond with Ser**° on transmembrane helix 5 that accompanies the contraction 
of the ligand-binding pocket’. Here we describe two structures 
of the adenosine A2A receptor (A2AR) bound to two different agonists, 
which indicate that the initial action of agonist binding to A2AR has 
both similarities and differences compared to agonist binding in β-ARs. 
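
The simple two-state picture described above (an equilibrium between R and R*, with agonists preferring R* and inverse agonists preferring R) can be made concrete with a small numerical sketch. This is a textbook-style illustration under stated assumptions, not a model fitted to any receptor in this work: the equilibrium constant `L`, the dissociation constants `K_R` and `K_Rstar`, and all numerical values below are hypothetical.

```python
def active_fraction(L, ligand=0.0, K_R=1.0, K_Rstar=1.0):
    """Fraction of receptor in the R* state for a two-state model.

    L        -- [R*]/[R] in the absence of ligand (hypothetical value)
    ligand   -- free ligand concentration (same units as K_R and K_Rstar)
    K_R      -- dissociation constant of the ligand for R
    K_Rstar  -- dissociation constant of the ligand for R*
    """
    inactive = 1.0 + ligand / K_R          # R plus ligand-bound R
    active = L * (1.0 + ligand / K_Rstar)  # R* plus ligand-bound R*
    return active / (inactive + active)


L = 0.01  # mostly inactive receptor without ligand (illustrative)
basal = active_fraction(L)
# An agonist binds R* more tightly than R (smaller K_Rstar).
with_agonist = active_fraction(L, ligand=1e-5, K_R=1e-6, K_Rstar=1e-8)
# An inverse agonist binds R more tightly than R*.
with_inverse_agonist = active_fraction(L, ligand=1e-5, K_R=1e-8, K_Rstar=1e-6)

print(f"basal: {basal:.4f}")
print(f"agonist: {with_agonist:.4f}")
print(f"inverse agonist: {with_inverse_agonist:.6f}")
```

In this sketch the ligand shifts the equilibrium purely through its preference for one state, which is the sense in which R* "preferentially binds agonists"; intermediate states such as the one proposed for A2AR-GL31 would need additional states in the scheme.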

The native human A2AR when bound to its endogenous agonist 
adenosine or to the high-affinity synthetic agonist NECA is unstable 
in detergent, so crystallization and structure determination relied on 
using a thermostabilized construct (A2AR-GL31) that contained four 
point mutations, which markedly improved its thermostability. 
Pharmacological analysis showed that the mutant receptor bound 
the five antagonists tested with greatly reduced affinity (1.8-4.3 log 
units), whereas four agonists bound with similar affinity to the wild-type 
receptor (Supplementary Fig. 1). However, A2AR-GL31 is only 
weakly activated by the agonist CGS21680 (Supplementary Fig. 2), 
which indicates that the thermostabilizing mutations might also 
decouple high-affinity agonist binding from the formation of R*. 
The conformation of A2AR-GL31 is not consistent with it being in 
the fully activated G-protein-coupled state, because we do not observe 
the 42-fold increase in affinity for NECA binding measured for 
Gs-coupled A2AR (ref. 23). These data all indicate that A2AR-GL31 is in 
an intermediate conformation between R and R*, which is consistent 
with the structural analysis presented later. 
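
The affinity changes above are quoted partly in log units and partly as a fold change, and converting between the two makes them directly comparable. A minimal sketch, in which the helper names are illustrative:

```python
import math


def log_units_to_fold(log_units):
    """Convert a difference in log10 units to a fold change."""
    return 10 ** log_units


def fold_to_log_units(fold):
    """Convert a fold change to a difference in log10 units."""
    return math.log10(fold)


# Reduced antagonist affinity of 1.8-4.3 log units, expressed as fold changes.
low, high = log_units_to_fold(1.8), log_units_to_fold(4.3)
# The 42-fold increase in NECA affinity, expressed in log units.
neca_shift = fold_to_log_units(42)

print(f"1.8 log units = {low:.0f}-fold, 4.3 log units = {high:.0f}-fold")
print(f"42-fold = {neca_shift:.2f} log units")
```

On this scale the loss of antagonist affinity (roughly 63-fold to 20,000-fold) spans a much wider range than the 42-fold agonist shift associated with G-protein coupling (about 1.6 log units).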

The two structures we have determined are of A2AR-GL31 bound to 
adenosine and NECA with resolutions of 3.0 Å and 2.6 Å, respectively 
(Supplementary Table 1). Global alignments of the A2AR-GL31 structures 
with A2A-T4L (A2AR with T4 lysozyme inserted into inner loop 
3) bound to the inverse agonist ZM241385 were performed based on 
those residues in the region of the ligand-binding pocket that show the 
closest structural homology (Fig. 1 and Supplementary Text). This 
gave an r.m.s.d. in Cα positions of 0.66 Å for the 96 atoms selected, 
which include all residues involved in binding either adenosine or 
NECA, with the exception of those in H3. Using this transformation, 
the adenine-like moiety of the two ligands superimposes almost 
exactly (r.m.s.d. 0.56 Å). The most significant differences between 
the two structures are seen in a distortion and a 2 Å shift primarily 
along the helical axis of H3, a bulge in H5 (resulting from non-helical 
backbone conformation angles of residues Cys 185 and Val 186) that 
Figure 1 | Structure of the adenosine A2A receptor bound to NECA 
compared to other GPCR structures. a, The structure of NECA-bound A2AR 
is shown as a cartoon (yellow) aligned with the structure of A2A-T4L bound to 
the inverse agonist ZM241385 (blue; PDB code 3EML*). NECA is shown as a 
space-filling model (C, green; N, blue; O, red). b, c, Sections through the aligned 
receptors in a that highlight the differences in the intracellular face of the 


shifts residues into the binding pocket by up to 2 Å and also a change in 
conformation of the cytoplasmic ends of H5, H6 and H7 (Fig. 1). 
Comparison of the A2AR-GL31 structure with the agonist-bound 
β2-AR-Nb80 complex indicates that these differences are similar to 
the conformational changes in the β2-AR that are proposed to be 
responsible for the formation of the R* state’. However, it is unlikely 
that the structure of A2AR-GL31 represents the fully activated state, 
because comparison with opsin bound to the C-terminal peptide of the 
G protein transducin shows that there is insufficient space in A2AR-GL31 
for the C terminus of the G protein to bind (Supplementary Fig. 3). 
This conclusion rests on the assumption that all G proteins bind and 
activate GPCRs in a similar fashion, but given the highly conserved 
structures of both G proteins and GPCRs this seems a reasonable 
hypothesis. 
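
The r.m.s.d. values quoted above summarize how far apart equivalent atoms lie once two structures have been superimposed. The sketch below shows only that final calculation, assuming the two coordinate sets are already aligned and listed in matching order; it does not perform the fitting step itself, and the coordinates are made-up toy values rather than atoms from any deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally sized lists of (x, y, z) points.

    Assumes the structures are already superimposed and the atoms are paired by index.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy example: three paired atoms, one of which is displaced by 2 units along x.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(f"r.m.s.d. = {rmsd(reference, shifted):.3f}")
```

Because the deviation is averaged over all selected atoms, a low overall value can coexist with a local shift of a few ångströms, which is why the text reports both the global figure and the individual 2 Å displacements.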

The fact that the structure of A2AR-GL31 represents an agonist-binding 
state is consistent with how A2AR-GL31 was engineered. 
Thermostabilizing mutations were selected by heating the NECA-bound 
detergent-solubilized receptor, so the mutations are anticipated 
to stabilize the agonist-bound state by stabilizing helix-helix 
interactions and/or biasing the conformational equilibrium between 
the agonist-bound R* state and the agonist-bound R state***®. The two 
most thermostabilizing mutations, L48A and Q89A, are in regions of 
the receptor that are involved in transitions between R and R*, providing 
a possible explanation for their thermostabilizing effect (Supplementary 
Fig. 4). The other two mutations, A54L and T65A, are at 
the receptor-lipid interface and the reason for their thermostabilizing 
effect is unclear. Although the overall shapes of the ligand-binding 
pockets of A2AR and β-AR are different, the structural similarities with 
the β2-AR-Nb80 (ref. 5) and the structural differences to ZM241385-bound 
A2A-T4L* indicate that the structure of the binding pocket in 
A2AR-GL31 is a good representation of the agonist-bound binding 
pocket of the wild-type receptor (Fig. 1). 

Adenosine and NECA bind to A2AR-GL31 in a virtually identical 
fashion; in addition, the adenine ring in the agonists interacts with 
A2AR in a similar way to the chemically related triazolotriazine ring of 
the inverse agonist ZM241385 (Fig. 2). Thus the hydrogen bonds 
receptors (b) and in the ligand-binding pocket (c), with the bulge in H5 shown 
as an inset. d, e, Alignment of NECA-bound A2AR (yellow) with agonist-bound 
β2-AR-Nb80 (red; PDB code 3P0G*) showing the intracellular face of the 
receptors (d) and the ligand-binding pocket (e). NECA is shown as a space- 
filling model in c and e. The figures were generated using CCP4mg”’. 
Analogous alignments to opsin are depicted in Supplementary Fig. 7. 


between the exocyclic adenosine N6 (Supplementary Fig. 5) and both 
Glu 169 in extracellular loop 2 (EL2) and Asn 253^6.55 in H6 are similar, 
with the significant π-stacking interaction with Phe 168 in EL2 also 
conserved. One of the major structural differences between ZM241385 
and the agonists is the presence of a furan substituent on C20 of 
triazolotriazine in the inverse agonist, whereas agonists contain a 
ribose substituent linked to N9 of adenine (Fig. 2 and Supplementary 
Fig. 5). In ZM241385, the furan group forms a hydrogen bond 
with Asn 253^6.55 in H6 and van der Waals contacts with other residues 
in H3, H5 and H6 (ref. 8). In contrast, the ribose moiety in agonists 
forms hydrogen bonds with Ser 277^7.42 and His 278^7.43 in H7, in addition 
to van der Waals interactions with other residues in H3 and H6 
(Fig. 2). In particular, Val 84^3.32 has to shift its position upon agonist 
binding owing to a steric clash with the ribose ring, which may contribute 
to the 2 Å shift observed in H3 (Fig. 3). These differences in 
binding between ZM241385 and either adenosine or NECA indicate 
that the residues that bind uniquely to agonists (Ser 277^7.42 and 
His 278^7.43) have a key role in the activation of the receptor, as previously 
shown by mutagenesis studies*””*. This is analogous to the situation 
in the activation of β1-AR, where only full agonists cause the 
rotamer conformation changes of Ser>*° in H5, whereas the inverse 
agonist ICI118551 prevents receptor activation by sterically blocking 
the rotamer change””’. However, the details of the activation differ in 
that the critical residues that bind agonists and not antagonists are in 
H5 in the β1-AR, but in H7 in the A2AR (Fig. 4). 

Adenosine and NECA activate the A2AR through interactions with 
H3 and H7 that are absent in the interactions between the receptor and 
the inverse agonist ZM241385 (Fig. 2). The inward shift of H7, the 
movement of H3 and the consequent formation of a bulge in H5 are all 
observed in the structures of agonist-bound A2AR-GL31 and β2-AR-Nb80 
(Fig. 1). The formation of the bulge in H5 of the β2-AR-Nb80 
structure was linked to a series of conformational changes that generate 
the 60° rotation of H6 about Phe 282^6.44, resulting in the cytoplasmic 
end of H6 moving out from the receptor centre and opening the cleft 
where the C terminus of a G protein is predicted to bind as observed in 
opsin®’. There are analogous side-chain movements in A2AR-GL31 
that result in a 40° rotation of H6, but the cytoplasmic end of H6 
remains partially occluding the G-protein-binding cleft (Supplementary 
Fig. 3), perhaps because the fully active conformation requires 
Figure 2 | Comparison of receptor-ligand 
interactions for A2AR bound to the inverse 
agonist ZM241385 and the agonists NECA and 
adenosine. a-c, Structures of the human A2AR in 
cartoon representation are shown bound to the 
following ligands: a, ZM241385 (PDB code 
3EML’); b, NECA; and c, adenosine. d, e, Polar and 
non-polar interactions involved in agonist binding 
to A2AR are shown for NECA (d) and adenosine 
(e). Amino acid residues within 3.9 Å of the ligands 
are depicted, with residues highlighted in blue 
making van der Waals contacts (blue rays) and 
residues highlighted in red making potential 
hydrogen bonds with favourable geometry (red 
dashed lines, as identified by HBPLUS, see 
Methods) or hydrogen bonds with unfavourable 
geometry (blue dashed lines, donor-acceptor 
distance more than 3.6 Å). Where the amino acid 
residue differs between the human A2AR and the 
human A1R, A2BR and A3R, the equivalent residue 
is shown highlighted in orange, purple or green, 
respectively. Panels a-c were generated using 
PyMOL (http://www.pymol.org/). Omit densities 
for the ligands are shown in Supplementary Fig. 6 
and densities for water molecules in 
Supplementary Fig. 8. 
the binding of G proteins to stabilize it. Interestingly, the structure of 
β2-AR° with a covalently bound agonist is also not in the fully activated 
R* conformation, which is only seen after the nanobody Nb80 is 
bound’. The importance of the bulge in H5 in the activation of A2AR 
is highlighted by how inverse agonists bind. Formation of the H5 bulge 
results in the inward movement of Cys 185^5.46 (Cβ moves by 4 Å), 
which in turn causes the movement of Val 186 and ultimately a shift 
of His 250^6.52 by 2 Å into the ligand-binding pocket, thereby sterically 
blocking the binding of ZM241385 (Supplementary Fig. 4). Hence, 
when the inverse agonist binds, it is anticipated that the H5 bulge is 
unlikely to form owing to the opposite series of events and hence the 
formation of the R* state is inhibited. 

Thus, in both β-ARs and A2AR, the formation of the H5 bulge seems 
to be a common action of agonists, whereas inverse agonists seem to 
prevent its formation. However, the energetic contributions to its 
formation may be different between the two receptors. In β-ARs there 
is a major contribution from direct interaction between the agonist and 
Ser®“°, whereas in A2ARs, the major interaction seems to come from 


Figure 3 | Positions of adenosine and ZM241385 in the A2AR ligand- 
binding pocket. The structures of adenosine-bound A2AR-GL31 and 
ZM241385-bound A2A-T4L were aligned using only atoms from the protein to 
allow the ligand positions to be compared, with adenosine in yellow and 
ZM241385 in pink (N, blue; O, red). The ligands are shown in the context of the 
binding pocket of A2AR-GL31, with transmembrane helices of A2AR-GL31 
shown in yellow and the surfaces of the receptor, including the cavity of the 
ligand binding pocket, shown in grey. The side chains of Val 84 and Leu 85 that 
interact with the ribose moiety of the agonist are shown in green. 
Figure 4 | Comparison of the positions of agonists in the binding pockets of 
the A2AR and β1-AR. a, The structures of the A2AR bound to adenosine and the 
β1-AR bound to isoprenaline (PDB code 2Y03)’ were aligned by superimposing 
equivalent atoms in the protein structure and the positions of both ligands are 
shown as stick models with the carbon atoms in blue/green (isoprenaline) or 
yellow (adenosine); N, blue; O, red. The A2AR structure is shown, with H5 and 
H7 as space-filling models (C, grey; N, blue; O, red) and the remainder of the 
structure as a cartoon (pale green). Some water molecules are shown as red 
spheres, hydrogen bonds as red dashed lines and the polar contacts as blue 
dashed lines. The orientation of the figure is identical to that shown in Fig. 2. 
b, Structure of the A2AR bound to adenosine viewed from the extracellular 
surface. c, Structure of β1-AR bound to isoprenaline (PDB code 2Y03)’ viewed 
from the extracellular surface. In panels b and c, equivalent side chains in the 
respective structures that make contacts to both isoprenaline and adenosine in 
their respective receptors are shown as space-filling models and they have the 
following Ballesteros-Weinstein numbers (amino acid side chains are shown in 
parentheses for the A2AR and β1-AR, respectively): 3.32 (V84, D121); 3.36 (T88, 
V125); 5.42 (N181*, S211); 6.51 (L249, F306); 6.55 (N253, N310); 6.52 (H250*, 
F307); 7.39 (I274, N329); 7.43 (H278, Y333). An asterisk indicates residues that 
only make indirect contacts to the agonists via a water molecule. 




interactions between the agonist and H3, combined with polar inter- 
actions involving residues in H7. Despite these differences, agonist 
binding to both receptors involves strong attractive non-covalent 
interactions that pull the extracellular ends of H3, H5 and H7 together, 
which is the necessary prerequisite to receptor activation. 

While this manuscript was in review, a related manuscript
appeared, describing the structure of the A2A-T4L chimaera bound
to the agonist UK432097, which is identical to NECA except for two
large substituents on the adenine ring. The structure of UK432097-bound
A2A-T4L is very similar to the structures presented here in the
transmembrane regions (r.m.s.d. 0.6 Å), although there are differences
in the extracellular surface due to the bulky extensions of UK432097
interacting with the extracellular loops and the absence of density for
residues 149-157. Xu et al. conclude that the structure of UK432097-bound
A2A-T4L is in an “active state configuration”, whereas we conclude
that the NECA- and adenosine-bound structures are best
defined as representing an intermediate state between R and R*.


METHODS SUMMARY 


Expression, purification and crystallization. The thermostabilized A2AR-GL31
construct contains amino acid residues 1-316 of the human A2AR, four
thermostabilizing point mutations (L48A(2.46), A54L(2.52), T65A(2.63) and Q89A(3.37)) and the
mutation N154A to remove a potential N-glycosylation site. A2AR-GL31 was
expressed in insect cells using the baculovirus expression system and purified in
the detergent octylthioglucoside using Ni2+-NTA affinity chromatography and size
exclusion chromatography (see Methods). The purified receptor was crystallized in
the presence of cholesteryl hemisuccinate by vapour diffusion, under conditions
described in Methods.

Data collection, structure solution and refinement. Diffraction data were
collected in multiple wedges (20° per wedge) from a single cryo-cooled crystal
(100 K) for the GL31-NECA complex at beamline ID23-2 at the European
Synchrotron Radiation Facility and from four crystals for the GL31-adenosine
complex, at beamline I24 at the Diamond Light Source. The structures were solved
by molecular replacement using the ZM241385-bound A2A-T4L structure (PDB
code 3EML) as a model (see Methods). Data collection and refinement statistics
are presented in Supplementary Table 1 and omit densities for the ligands are
shown in Supplementary Fig. 6.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS

Expression, purification and crystallization. The human A2A construct, GL31,
contains four thermostabilizing point mutations (L48A(2.46), A54L(2.52), T65A(2.63) and
Q89A(3.37)), the mutation N154A to remove the potential N-glycosylation site and a
truncation at the C terminus after Ala 316 (ref. 32). A polyhistidine tag (His10) was
engineered at the C terminus, separated from the receptor by a TEV protease
cleavage site. Baculovirus expression and membrane preparation were performed
as described previously for the β1-AR.

Membranes were thawed at room temperature (20-25 °C), diluted with 25 mM
HEPES pH 7.4, in the presence of protease inhibitors (Complete; Boehringer).
Membranes were pre-incubated with NECA at 100 µM for 45 min at room
temperature. The receptor-ligand complexes were then solubilised by adding
decylmaltoside (DM) and NaCl to give final concentrations of 1.5% and 0.3 M, respectively,
stirred for 30 min (4 °C) and insoluble material removed by ultracentrifugation
(120,000g, 45 min, 4 °C). All protein purification steps were performed at 4 °C.
The solubilised receptor sample was filtered through a 0.22 µm filter (Millipore)
and applied to a 5 ml Ni-NTA superflow cartridge (Qiagen) pre-equilibrated with
buffer (25 mM HEPES, pH 7.4, 0.1 M NaCl, 100 µM NECA, 0.15% DM, 2.5 mM
imidazole). The column was washed sequentially with the same buffer supplemented
with either 10, 40 or 80 mM imidazole, and then eluted with 250 mM imidazole. The
eluted receptor-ligand complex was mixed with His6-tagged TEV protease to cleave
the tag for 4-6 h, 4 °C, concentrated to 2 ml using an Amicon-ultra spin concentrator
(Ultracel-50K, Millipore) and then imidazole was removed using a PD-10 column
(GE Healthcare). Eluted fractions were further purified by binding the TEV and other
contaminants to Ni-NTA (QIAGEN) pre-equilibrated in 25 mM HEPES pH 7.4,
0.1 M NaCl, 100 µM NECA, 0.15% DM, 40 mM imidazole, incubating for 30 min
and then collecting the flow-through. For detergent exchange into 0.35%
octylthioglucoside (OTG), the sample was concentrated using an Amicon-ultra concentrator
(Ultracel-50K, Millipore), diluted tenfold in 25 mM HEPES pH 7.4, 0.1 M NaCl,
100 µM NECA, 0.35% OTG, and concentrated again to 0.3 ml. The protein sample
was applied to a Superdex 200 10/300 GL size-exclusion column pre-equilibrated in
25 mM HEPES pH 7.4, 0.1 M NaCl, 100 µM NECA, 0.35% OTG and run at 0.5 ml
min⁻¹. Eluted receptor fractions (2-2.5 ml) were concentrated to 50-60 µl. Protein
determination was performed using the amido black assay.

Before crystallization, cholesteryl hemisuccinate (CHS) and OTG were added to
1 mg ml⁻¹ and 0.5%, respectively, and the protein concentration adjusted to
10-12.5 mg ml⁻¹. NECA and adenosine A2A-GL31 crystal hits were obtained
using a new PEG-based crystallization screen developed in house. Crystals were
grown at 4 °C in 100 nl sitting drops using 0.05 M ADA NaOH, pH 6.4, 23.6% PEG
400, 4% v/v 2-propanol for the NECA complex. Crystals were cryoprotected by
soaking in 0.05 M ADA NaOH, pH 6.4, 45% PEG 400. For the adenosine complex,
crystals were initially grown in 0.05 M Tris-HCl, pH 7.6, 9.6% PEG 200, 22.9% PEG
300. Crystals were cryoprotected by soaking in 0.05 M Tris-HCl, pH 7.5, 15% PEG
200, 30% PEG 300. The crystals were mounted on Hampton CrystalCap HT loops
and cryo-cooled in liquid nitrogen.
Data collection, structure solution and refinement. Diffraction data for the
NECA complex were collected at the European Synchrotron Radiation Facility with
a Mar 225 CCD detector on the microfocus beamline ID23-2 (wavelength, 0.8726 Å)
using a 10 µm focused beam and for the adenosine complex on beamline I24 at the
Diamond Light Source with a Pilatus 6M detector and a 10 µm microfocus beam
(wavelength 0.9778 Å). The microfocus beam was essential for the location of the
best diffracting parts of single crystals, as well as allowing several wedges to be
collected from different positions. Images were processed with MOSFLM and
SCALA. The NECA complex was solved by molecular replacement with
PHASER using the A2A-T4L structure (PDB code 3EML) as a model after removal
of the coordinates for T4L, all solvent molecules and the inverse agonist ZM241385.
This structure was then used as a starting model for the structure solution of the
adenosine complex. Refinement and rebuilding were carried out with REFMAC5
and COOT, respectively. In the final models, 98.1% of residues were in the
favoured region of the Ramachandran plot with one outlier for the NECA complex,
and 97.7% with no outliers for the adenosine complex. SMILES strings for NECA and
adenosine were created using Sketcher and dictionary entries using Libcheck.
Hydrogen bond assignments for the ligands were determined using HBPLUS.

To facilitate a structural comparison between ZM241385-bound A2A-T4L and the
thermostabilized A2A-GL31 with bound agonist, the structures were superimposed
based on those residues in the region of the ligand-binding pocket that show the
closest structural homology. This was achieved using the lsq_improve option of
program O and an initial transformation based on residues at the C terminus of
helix 6 and the N terminus of helix 7. The final superposition, based on residues 16-21
in H1, 51-70 in H2 and ECL1, 132-140 in H4 and ECL2, 142-146 in ECL2, 166-182
in ECL2 and H5 and 245-283 in H6, ECL3 and H7, gave an r.m.s.d. in Cα positions
of 0.66 Å for the 96 atoms and includes almost all residues involved in binding
either ligand with the exception of those in H3. Using this transformation, the
adenine moiety of the agonist superimposes well with the equivalent atoms of the
triazolotriazine bicyclic ring of ZM241385 (r.m.s.d. 0.56 Å). Validation of the final
refined models was carried out using MolProbity. Omit densities for the ligands
are shown in Supplementary Fig. 6. All figures in the manuscript were generated
using either PyMOL (DeLano Scientific) or CCP4mg.
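
The r.m.s.d. values quoted above are root-mean-square distances between paired atoms once the two structures have been superimposed. As a minimal sketch of that statistic only (the function name and the toy coordinates are illustrative assumptions, and the least-squares superposition carried out in O is not reproduced here), it can be computed as:

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of
    (x, y, z) coordinates that are assumed to be already superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical coordinates in Å, not taken from either structure:
# three atoms, each displaced by 0.6 Å along y.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
moved = [(0.0, 0.6, 0.0), (3.8, 0.6, 0.0), (7.6, 0.6, 0.0)]
print(round(rmsd(reference, moved), 2))
```

In practice the paired lists would hold the Cα positions of the residues chosen for the superposition, which is why the value depends on the residue selection reported above.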

Binding of agonists and antagonist to A2AR-GL31 expressed in CHO cells.
Chinese hamster ovary (CHO) cells were maintained in culture in DMEM HAMs
F12 media containing 10% FBS. Cells were transfected with plasmids expressing
either wild-type adenosine A2AR or A2AR-GL31 using GeneJuice according to
manufacturer’s instructions (EMD Biosciences). Forty-eight hours after transfection,
cells were harvested, centrifuged at 200g for 5 min at 4 °C and the pellet
re-suspended in 20 mM HEPES, 10 mM EDTA buffer (pH 7.4). The membrane
suspension was homogenized and centrifuged at 200g for 15 min at 4 °C. The
supernatant was collected, the pellet re-suspended in 20 mM HEPES, 10 mM
EDTA (pH 7.4) buffer and the solution homogenized and centrifuged as described
previously. The collected supernatant was centrifuged for 30 min at 40,000g at
4 °C. Pellets were re-suspended in 20 mM HEPES, 0.1 mM EDTA to a protein
concentration of 1 mg ml⁻¹ and stored at −80 °C.

Membranes from CHO cells transiently expressing wild-type or A2AR-GL31
(10-15 µg per well) were assessed using competition [3H]NECA binding in buffer
containing 50 mM Tris-HCl (pH 7.4) as described previously. Inhibition curves
were fitted to a four-parameter logistic equation to determine IC50 values, which
were converted to Ki values using KD values determined by saturation binding and
the [3H]NECA concentration of 10 nM.
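
The conversion described above can be illustrated with a short sketch. It assumes the standard Cheng-Prusoff relation, Ki = IC50 / (1 + [L]/KD), which the text does not name explicitly, and a common form of the four-parameter logistic curve. The function names and the numerical values other than the 10 nM radioligand concentration are hypothetical.

```python
def four_parameter_logistic(conc, bottom, top, ic50, hill):
    """Bound radioligand at a given competitor concentration for a
    four-parameter logistic inhibition curve (one common parameterization)."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)

def cheng_prusoff_ki(ic50, radioligand_conc, kd):
    """Convert an IC50 to Ki with the Cheng-Prusoff relation.
    All three arguments must be in the same concentration unit."""
    return ic50 / (1.0 + radioligand_conc / kd)

# Hypothetical values in nM; only the 10 nM radioligand concentration
# comes from the text.
example_ic50 = 100.0
example_kd = 10.0
radioligand = 10.0

print(cheng_prusoff_ki(example_ic50, radioligand, example_kd))
# At the IC50 the curve sits midway between its plateaus.
print(four_parameter_logistic(example_ic50, 0.0, 100.0, example_ic50, 1.0))
```

Because the correction factor depends on the ratio of radioligand concentration to KD, the same IC50 gives different Ki values for receptors with different radioligand affinities, which is why KD was determined separately by saturation binding.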
G-protein-coupling activity of A2AR-GL31 measured in whole cells. A2AR-His6
and A2AR-GL31-His6 (amino acid residues 1-316 of human A2AR) were subcloned
into plasmid pcDNA5/FRT/TO using KpnI and NotI restriction sites. Flp-in T-Rex
HEK293 cells were maintained at 37 °C in a humidified atmosphere in Dulbecco’s
modified Eagle’s medium without sodium pyruvate, supplemented with 4,500 mg
l⁻¹ glucose, L-glutamine, 10% (v/v) FBS, 1% penicillin/streptomycin mixture and
10 µg ml⁻¹ blasticidin. To generate stable cell lines, the cells were transfected with a
ratio of 1:9 receptor cDNA in pcDNA5/FRT/TO vector and pOG44 vector using
GeneJuice as per manufacturer’s instructions (EMD Biosciences). After 48 h,
media were replaced with fresh medium supplemented with 200 µg ml⁻¹
hygromycin B to select for stably expressing clones. Colonies were combined and tested
for doxycycline-induced receptor expression. To induce receptor expression,
clones were treated with either 1 ng ml⁻¹ or 3 ng ml⁻¹ doxycycline for 16 h.

Cells were seeded at a density of 25,000 per well in a poly-L-lysine coated 96-well
half-area plate. Cells were induced with doxycycline (3 or 1 ng ml⁻¹) for 16 h. After 16 h,
media were removed and replaced with fresh media containing 100 µM Ro-201724
and 2 U ml⁻¹ adenosine deaminase. Cells were incubated at 37 °C for 30 min before
addition of varying concentrations of agonist (25 °C, 30 min). As a control, cells were
also incubated for 30 min (25 °C) with 10 µM forskolin. Cells were then lysed and
cAMP produced detected using the CisBio cAMP kit according to manufacturer’s
instructions before plates were read on a PolarStar fluorescence plate reader.
Principles of activation and permeation
in an anion-selective Cys-loop receptor 


Ryan E. Hibbs1 & Eric Gouaux1,2


Fast inhibitory neurotransmission is essential for nervous system function and is mediated by binding of inhibitory 
neurotransmitters to receptors of the Cys-loop family embedded in the membranes of neurons. Neurotransmitter 
binding triggers a conformational change in the receptor, opening an intrinsic chloride channel and thereby 
dampening neuronal excitability. Here we present the first three-dimensional structure, to our knowledge, of an 
inhibitory anion-selective Cys-loop receptor, the homopentameric Caenorhabditis elegans glutamate-gated chloride 
channel α (GluCl), at 3.3 Å resolution. The X-ray structure of the GluCl-Fab complex was determined with the allosteric
agonist ivermectin and in additional structures with the endogenous neurotransmitter L-glutamate and the 
open-channel blocker picrotoxin. Ivermectin, used to treat river blindness, binds in the transmembrane domain of 
the receptor and stabilizes an open-pore conformation. Glutamate binds in the classical agonist site at subunit interfaces, 
and picrotoxin directly occludes the pore near its cytosolic base. GluCl provides a framework for understanding 
mechanisms of fast inhibitory neurotransmission and allosteric modulation of Cys-loop receptors. 


Fast inhibitory neurotransmission modulates both the magnitude and 
duration of neuronal activity, occurs on a timescale of milliseconds, and 
involves the release of inhibitory neurotransmitters into the synapse 
and activation of the cognate ligand-gated ion channels. As demonstrated
nearly 60 years ago, fast inhibitory neurotransmission leads to
an increase in the permeability of the cell membrane to chloride, the
most abundant biological anion. Because the membrane potential at
which chloride is at equilibrium is near the neuronal resting potential,
neurotransmitter-gated, chloride-selective ion channels generally
oppose normal excitability and repolarize the cell.

The neurotransmitter receptors that directly mediate chloride permeability
constitute one half of the Cys-loop receptor family. Receptors
in this family are composed of five either identical or homologous
subunits, which generate diversity in functional profiles and pharmacological
preferences. Cys-loop receptors fall into two broad categories.
The cation-selective members are the nicotinic acetylcholine (nAChR)
and serotonin 5-HT3 receptors. Those selective for anions include the
γ-aminobutyric acid (GABAA,C), glycine receptors and invertebrate
glutamate-gated chloride channels (GluCl). So far, there is no structural
information for an anion-selective Cys-loop receptor, and the
mechanism by which chloride is selected remains unclear.

Ligand-gated chloride channels are critical not only for maintaining 
appropriate neuronal activity, but have long been important therapeutic 
targets: benzodiazepines, barbiturates, some intravenous and volatile 
anaesthetics, alcohol, strychnine, picrotoxin and ivermectin all derive 
their biological activity from acting on the inhibitory half of the Cys-loop 
receptor family. Of note is that many of the therapeutically useful
compounds acting at Cys-loop receptors target an allosteric site. The 
sites in Cys-loop receptors at which these allosteric ligands bind and 
their structure-based mechanisms of action are largely unresolved. 


Crystallization of GluCl-Fab complex 

We identified the Caenorhabditis elegans GluClα glutamate-gated
chloride channel as a promising candidate using fluorescence-detection
size-exclusion chromatography (FSEC). In comparison to human
Cys-loop receptors, GluClα is most similar to the α1 glycine receptor,
with which it shares 34% amino acid sequence identity (see alignment
in Supplementary Fig. 1). Optimization of the receptor construct for
crystallization (GluClcryst) was guided by FSEC analysis and required
deletion of 41 residues from the amino terminus, 6 residues from the
carboxy terminus and replacement of the M3-M4 loop (Lys 345-Lys 402)
with an Ala-Gly-Thr tripeptide. Well-ordered crystals diffracting
to ~3.3 Å resolution required co-crystallization of GluClcryst
as a complex with a Fab, ivermectin and lipids (Supplementary Fig. 2).
Structures with agonist or channel blocker at 3.3 and 3.4 Å were
obtained by soaking GluClcryst-Fab-ivermectin crystals with glutamate
or picrotoxin, respectively. The electron density maps are of high
quality, thus enabling the positioning of almost all receptor residues
and refinement to satisfactory crystallographic residuals and
stereochemistry (Supplementary Table 1 and Supplementary Fig. 3).


Architecture 


The GluClcryst-Fab complex forms a pinwheel shape comprising a
cylindrical homopentamer of GluClcryst subunits with Fab molecules
bound at each subunit interface (Fig. 1a, b). Each GluClcryst subunit
consists of a large N-terminal extracellular domain of mostly β structure,
followed by four α-helical transmembrane spans (M1-M4;
Fig. 1c). The overall architecture of the extracellular domain is similar
to that found in the bacterial receptor orthologues from Gloeobacter
violaceus (GLIC) and Erwinia chrysanthemi (ELIC). There is an
additional helix at the N terminus reminiscent of the acetylcholine-binding
protein (AChBP) and Torpedo marmorata nAChR
structures. Significantly, GluCl contains the Cys-loop disulphide
strictly conserved in eukaryotes as well as a disulphide bond in loop
C present in glycine receptors (Fig. 1c). The transmembrane helices
adopt a fold like the bacterial receptors and nAChR, with the five M2
segments lining the pore and adopting an open channel conformation,
akin to the conformation visualized in the GLIC structures.

To understand the molecular principles of ion channel activation, 
agonist binding and ion channel permeation and block, we determined 
Figure 1 | Architecture of the GluClcryst-Fab complex. a, View of the
GluClcryst-Fab complex looking down pore axis towards cytosol. Fab molecules
(cyan) are bound at each GluClcryst subunit interface. b, View parallel to lipid
membrane; only two Fab molecules are shown for clarity. The ligands
ivermectin and glutamate are represented as spheres with carbon atoms in
yellow, oxygen in red and nitrogen in blue. Const., constant region; Var.,
variable region. c, A single GluClcryst subunit from two angles, approximate
orientation as in panel b. The Cys loop and loop C disulphide bonds are shown
as spheres, N and C termini and transmembrane spans are indicated. Loops of
particular relevance to agonist binding and allosteric gating linkage are also
indicated.


separate crystal structures with the allosteric agonist ivermectin, and
with ivermectin and glutamate, picrotoxin or iodide. Ivermectin is
bound at each of the GluClcryst subunit interfaces in the transmembrane
domain whereas glutamate electron density is present in all five
of the classical neurotransmitter-binding sites in the extracellular




domain. Anomalous difference density for iodide is present in sites 
at the base of the transmembrane pore in a region important for ion 
selectivity, and a chloride ion was fit into non-protein electron density 
in the ion channel pore adjacent to the binding site for picrotoxin. 


Allosteric activation and modulation 


Ivermectin is a semi-synthetic macrocyclic lactone and broad-spectrum
antiparasitic agent, widely used to treat river blindness in humans and
parasitic infections in animals. It achieves its margin of therapeutic
efficacy by activating invertebrate glutamate-gated chloride channels at
nanomolar concentrations, yet it also manifests activating and
potentiating activities on vertebrate Cys-loop receptors and on
P2X ATP-gated ion channels at higher concentrations. Ivermectin
potently activates GluClα (Supplementary Fig. 4) while simultaneously
rendering the receptor susceptible to further activation by glutamate.
Hence, at GluClα, we deem ivermectin a partial allosteric agonist.

Ivermectin binds at subunit interfaces on the periphery of the transmembrane
domains, proximal to the extracellular side of the membrane
bilayer (Fig. 2a, b and Supplementary Figs 5-7). Wedged
between the M3 α-helix on the principal or (+) subunit and the M1
α-helix on the complementary or (−) subunit, ivermectin inserts
deeply into the subunit interface and makes important contacts with
the M2 (+) pore-lining α-helix and the M2-M3 loop. Its site occupies
approximately two turns of helix on the M1 and M3 helices and
centres on a single turn of π helix between residues Leu 217 and
Ile 222 on M1, as illustrated by a hydrogen bond between the main-chain
carbonyl oxygen of Leu 218 and a tertiary hydroxyl on ivermectin
(Fig. 2c). Through extensive hydrophobic interactions and one
hydrogen bond with each of the M1, M2 and M3 α-helices, ivermectin
buries 278 and 254 Å2 of surface area on the (+) and (−) subunits in
the interface, respectively.

Ser 260 forms a hydrogen bond with the secondary hydroxyl group
on the deeply buried cyclohexene ring of ivermectin (Fig. 2a-c and
Supplementary Fig. 8). A serine residue in this position is correlated
with direct activation by ivermectin in other Cys-loop receptors.
Glycine and GABAA receptors have a serine in the equivalent position
and are directly activated by ivermectin, yet there is no similar
serine in α7 nAChRs, where ivermectin is a positive allosteric modulator
but does not directly activate, nor in GluClβ receptors, where
ivermectin has no activity. The equivalent position is critical for
GABAA receptor modulation by alcohol, anticonvulsants, anaesthetics
and diuretics; glycine and 5-HT3 receptor modulation by
anaesthetics; and α7 nAChR modulation by additional compounds.
Hence, the ivermectin binding site in GluClcryst is shared, at least in
part, by many important modulators of Cys-loop receptors. In GluCl
we suspect that the interaction of ivermectin with the pore-lining M2
helix increases both its affinity for the receptor and its ability to
stabilize the open state.

Ivermectin binding to GluCl probably results in two types of conformational
changes: first, a local distortion of the receptor in the
vicinity of the binding site; and second, a global conformational
change of the receptor that corresponds to a transition from a closed,
resting state to an open, activated state. Because we lack a structure of
GluClcryst in the absence of ivermectin, GLIC provides a reference for
gauging the local structural consequences of ivermectin binding to the
transmembrane domain of the receptor. In comparing these two
structures we find that the binding of ivermectin increases the separation
between M1 and M3 of adjacent subunits, as defined by a 9.4 Å
spacing between GluClcryst Leu 218 and Gly 281 Cα atoms compared
to a 6.4 Å spacing for the corresponding atoms in GLIC. This splaying
apart of the transmembrane helices in GluClcryst occurs at the level of a
strictly conserved proline residue in M1 that forms the C-terminal
end of the short π helix (Supplementary Fig. 9).

We hypothesize that the global conformational change induced by 
ivermectin binding is rooted in the splaying apart of the M1 and M3 
helices and the movement of the apical portion of M2 away from the 
Figure 2 | Ivermectin-binding site and atomic interactions. a, b, Two
orientations of a GluCl subunit interface focusing on ivermectin-binding site. 
Dashed lines indicate hydrogen bonds. In a, view is from receptor periphery 
looking parallel to the membrane, and in b looking down pore from 
extracellular side with the extracellular domain removed for clarity. c, Chemical 
structure of ivermectin with interactions indicated. VDW, van der Waals. 
Atomic numbering is from PDB file. 


pore axis, towards the periphery of the receptor, opening an ion 
conductive pathway. This open-pore conformation of M2 is then 
stabilized through interactions between ivermectin and the apical 
end of M2. In addition, ivermectin may stabilize the open state of 
the ion channel through contacts between the disaccharide moiety 
and Ile 273 in the M2-M3 loop (Supplementary Fig. 10). 




Neurotransmitter-binding site 


The ion channel of GluClcryst is activated by glutamate only after activation
by ivermectin (Supplementary Fig. 11), in a manner similar to full-length
GluClα (ref. 24). The homomeric GluClβ and heteromeric
GluClαβ receptors, by contrast, are directly activated by glutamate. In
the context of GluClcryst, micromolar concentrations of glutamate augment
ivermectin-induced currents by 30-70%, similar to that of full-length
receptor. [3H]-L-glutamate binds directly to the GluClcryst receptor
with a dissociation constant (Kd) of 680 nM (Supplementary Figs 11-13).
In agreement with the electrophysiology experiments, [3H]-L-glutamate
binding requires ivermectin. To understand the molecular basis of glutamate
binding we determined the structure of GluClcryst in the presence
of ivermectin and glutamate. Sausage-shaped electron density assigned to
glutamate was ~8σ in Fo − Fc omit maps in all five of the classical agonist-binding
sites. Omit electron density maps subjected to real space, five-fold
averaging showed a protrusion in the electron density ‘sausage’ that we
attributed to the α-amino group of glutamate (Supplementary Fig. 14).

Glutamate binds in the classical neurotransmitter site in the extracellular
domain, lodged between subunits and nearly inaccessible to
solvent (Fig. 3a, b). The architecture of the site is box-like, with loops
from the (+) subunit forming ‘sides’ of the binding site and the β
strands on the (−) subunit defining the ‘base.’ Loop C, postulated to
have a critical role in allosteric activation, adopts a closed conformation
consistent with AChBP structures bound by agonists.
Functional groups on glutamate bridge the (+) and (−) subunits with
the α-substituents snugly sandwiched between Tyr 151 and Tyr 200 on
the (+) subunit, and positively charged residues, including Arg 37
from a region important for conotoxin-nAChR interaction, and
Arg 56 on the (−) subunit, making contacts with the α- and γ-carboxylate
groups. These arginine residues, in combination with neighbouring
cationic amino acids, provide the binding pocket with a strongly
positive electrostatic potential (Supplementary Fig. 15). The α-amino
nitrogen of glutamate is stabilized through a 3.8 Å cation-π interaction
with Tyr 200 on loop C, a hydrogen bond with the backbone carbonyl
oxygen of Ser 150 and a close interaction with the backbone carbonyl
oxygen of Tyr 151. A comparison of the determinants of glutamate
binding with the corresponding residues in the AChBPs and other
receptors is made in Supplementary Fig. 16.

To test the sensitivity of the glutamate binding site to perturbations
in ligand structure, we screened glutamate analogues for competition
with [3H]-L-glutamate bound to the ivermectin-complexed receptor
(Fig. 3c and Supplementary Fig. 17). L-Glutamate bound much more tightly
than L-homocysteine sulphinic acid, which differs only in replacement
of Cδ with sulphur. Extending the side-chain length with an
extra carbon (L-amino adipic acid) or shortening it (L-aspartate)
resulted in a further drop in affinity, and changing the stereochemistry
(D-glutamate) or removing the side-chain negative charge but not its
ability to hydrogen bond (L-glutamine) decreased binding further.
Thus, the GluCl neurotransmitter-binding pocket is selective for small
dicarboxylate L-amino acids, consistent with the constellation of
atomic interactions between agonist and receptor (Fig. 3a, b).

Upon the binding of glutamate the side chain of Arg 56 in loop D
(β2) shifts by ~0.5 Å to accommodate the agonist and Tyr 200 in loop
C repositions by ~0.5 Å closer to the ligand, small yet significant conformational
changes consistent with movements of loops C and D in
the agonist-induced activation of the receptor. These residues, together
with Arg 37 (β1), are located on elements of protein structure directly
connected to the ion channel pore. We suggest that ivermectin, a
partial allosteric agonist, stabilizes an ‘activated’ conformation of the
agonist site and that binding of glutamate to this ‘activated’ site further
stabilizes the open state of the receptor, increasing chloride conductance.
Ivermectin may transduce a conformational change to the neurotransmitter
site through its interactions with the M2-M3 loop,
located at the structural nexus of three extracellular domain loops
central to allosteric communication between the neurotransmitter site
in the extracellular domain and the transmembrane pore: the Cys,
Figure 3 | Glutamate-binding site and specificity. a, View from extracellular
side towards membrane at glutamate in binding site in subunit interface.
b, View of binding site looking parallel to membrane with loop C removed for
clarity. Dashed lines with distances in Å indicate hydrogen bonding and, in the
case of Tyr 200, cation-π interactions. Unless a range is given, distances are an
average from the five binding sites. c, Radioligand competition experiments
with L-glutamate and congeners against 1 mM [3H]-L-glutamate. Calculated
inhibition constant (Ki) values assume a Kd for [3H]-L-glutamate of 680 nM
and are shown in inset table. n = 2. CI, confidence interval. L-HCS and L-AA are
L-homocysteine sulphinic acid and L-amino adipic acid, respectively. Error bars
are s.e.m. and nH is the Hill coefficient.


β1-β2 and β8-β9 loops (Fig. 2a and Supplementary Fig. 10).
Hydrophobic residues in the M2-M3 loop mediating these interactions
are well conserved in most Cys-loop receptors, consistent with
the M2-M3 loop having a central role in the activation mechanism of
receptors throughout the family.


Pore conformation 


To test the hypothesis that the GluClcryst-ivermectin structure represents
an open, conducting conformation (Fig. 4), we carried out
Figure 4 | Ion channel. a, Purple spheres represent internal surface of
transmembrane ion channel, with side chains shown for pore-lining residues
from two of the five M2 α-helices that line the pore; Ser 260 does not line the
pore but hydrogen bonds with ivermectin (IVM). b, Pore diameter is plotted as
a function of longitudinal distance along the pore for GluClcryst, open (GLIC;
PDB code 3EAM) and closed (ELIC; PDB code 2VL0) bacterial receptors.


functional and structural studies using picrotoxin, an open channel
blocker (Fig. 5 and Supplementary Fig. 18). Electron density in
picrotoxin-soaked crystals was apparent at a position near the cytosolic
side of the transmembrane pore, on the five-fold axis of molecular
symmetry (4.3σ; Supplementary Fig. 19). Thus, the observed
electron density is an average of five orientations. Nevertheless, the
egg-shaped picrotoxin-associated electron density indicates that the
basket-like, fused tricyclic rings are directed extracellularly and near
the 2′ Thr, whereas the isoprenyl tail points towards the cytoplasm
and is proximal to the −2′ Pro residues. In this position, the majority
of the oxygen atoms of picrotoxin are cradled by the polar belt of
2′ Thr hydroxyls whereas the hydrophobic isoprenyl moiety is surrounded
by the methylene groups of the non-polar −2′ Pro side
chains. Most importantly, the binding of picrotoxin to the pore of
the GluClcryst-ivermectin complex reinforces our hypothesis that the
pore is in an open conformation.

Figure 5 | Picrotoxin-binding site. a, The front two subunits have been
removed to show the picrotoxin location (boxed) at the cytosolic base of the
pore. Residues involved in picrotoxin binding are shown as sticks and van der
Waals surfaces are shown for picrotoxin. b, View looking into pore from the
extracellular domain at the picrotoxin position relative to the 2′ Thr and
−2′ Pro side chains. Picrotoxin is shown as sticks with carbon atoms in yellow
and oxygen atoms in red.

The smallest diameter of the GluClcryst ion channel pore is ~4.6 Å, defined by
a hydrophobic 'girdle' of −2′ Pro side chains proximal to the cytoplasmic side
of the membrane. Because chloride has a Pauling radius of 1.8 Å (ref. 2),
passage of chloride, iodide (Pauling radius of 2.2 Å (ref. 2)) and other
permeant ions through the −2′ Pro constriction must involve substantial
dehydration, in agreement with previous studies demonstrating a correlation
between energies of hydration and relative permeabilities (higher for iodide
than chloride; Supplementary Fig. 20). The pore constriction in GluClcryst is
somewhat smaller than that estimated for GABA_A, glycine and GluClβ receptors
(5.2-6.2 Å), based on low but measurable relative permeability to ions like
acetate, gluconate and phosphate. This difference may be due to the alanine
residues at the −2′ position in the β subunits of all three of those receptors.
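The dehydration argument above can be checked with simple arithmetic. The sketch below treats ions and water as hard spheres, which is a deliberate simplification; the pore diameter and Pauling radii come from the text, whereas the water diameter of roughly 2.8 Å is an assumption introduced here for illustration.

```python
# Pore diameter and Pauling radii (in Å) as quoted in the text.
PORE_DIAMETER = 4.6
PAULING_RADIUS = {"chloride": 1.8, "iodide": 2.2}
# Assumption (not from the source): approximate diameter of one water molecule.
WATER_DIAMETER = 2.8


def clearance(ion: str) -> float:
    """Pore diameter minus the bare ion diameter, in Å."""
    return round(PORE_DIAMETER - 2 * PAULING_RADIUS[ion], 2)


def fits_with_water_shell(ion: str) -> bool:
    """True if the ion with one water on each side would fit (hard-sphere model)."""
    hydrated_diameter = 2 * PAULING_RADIUS[ion] + 2 * WATER_DIAMETER
    return hydrated_diameter <= PORE_DIAMETER


for ion in PAULING_RADIUS:
    print(ion, clearance(ion), fits_with_water_shell(ion))
```

Under these assumptions the bare ions leave at most about 1 Å of clearance at the constriction, so neither could pass with an intact hydration shell, which is consistent with the substantial dehydration proposed above.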

GluCl is related to the Torpedo nAChR (PDB code 2BG9) in amino acid sequence
and three-dimensional structure and thus we compared the structures and the
aligned sequences. In so doing, we found inconsistencies between amino acid
sequence-based alignments and three-dimensional structure-based alignments of
the M2 and M3 α-helices. A similar finding was described in comparisons of
GLIC to the nAChR. Our analysis indicates that in the α subunit M2 pore-lining
helix and the M3 α-helix, the nAChR amino acid assignment is off in register by
4 residues or ~1 turn of an α-helix beginning with the M1-M2 loop
(Supplementary Fig. 21).
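To make the "4 residues or ~1 turn" statement concrete, the following sketch converts a register shift into turns, rotation about the helix axis and axial displacement. The geometric constants are the textbook values for an ideal α-helix (3.6 residues per turn, 1.5 Å rise per residue); they are assumptions of this illustration and are not stated in the source.

```python
# Ideal alpha-helix geometry (textbook values, assumed here).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # Å along the helix axis


def register_shift(n_residues: int):
    """Return (turns, net rotation in degrees, axial shift in Å) for a shift."""
    turns = n_residues / RESIDUES_PER_TURN
    # Net rotation about the helix axis after removing whole turns.
    rotation = (n_residues * 360.0 / RESIDUES_PER_TURN) % 360.0
    rise = n_residues * RISE_PER_RESIDUE
    return turns, rotation, rise


turns, rotation, rise = register_shift(4)
print(f"{turns:.2f} turns, {rotation:.0f} degrees net rotation, {rise:.1f} Å")
```

On these assumptions a 4-residue shift is about 1.1 turns: the side chains point in nearly the same direction as before (about 40° net rotation) but are displaced by roughly 6 Å along the helix axis.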


Ion selectivity 

Figure 6 | Ion selectivity. a, The front of the receptor is cut away to reveal the
interior surface of the pore, coloured by electrostatic potential. Dashed circle in
the pore indicates putative chloride-binding site. Boxed area is expanded in
b. b, Expanded view from a showing selected M2 side chains from opposing
subunits. Anomalous difference peaks at pore base are attributed to
iodide-binding sites (light grey mesh, contoured at 3.5σ). F_o − F_c omit density
for the putative chloride site is represented by yellow mesh contoured at 3σ.
c, Shown are the electropositive pockets where iodide ions bind, viewed from
inside the protein surface; four residues from adjacent M2 helices that
coordinate the iodide sites are shown as sticks. d, Electropositive pockets
viewed from the intracellular side. e, Putative chloride site viewed from the
extracellular side of the pore with 6′ Thr residues in the foreground;
F_o − F_c omit density (yellow) is contoured at 4σ. Carbon atoms are coloured by
chain and chloride is represented by a 1 Å cyan sphere. Closest distances in Å
from protein atoms to centre of sphere are indicated by dashed lines from the
6′ Thr side-chain hydroxyl.

Analysis of GluClcryst surface electrostatics reveals an electropositive
vestibule, a slightly electronegative extracellular half of the transmembrane
pore, and an electropositive intracellular half (Fig. 6a). None of the
pore-lining residues in GluClcryst bear a formal charge and thus the positive
electrostatic potential at the base of the pore arises from the oriented
peptide dipoles in the M2 α-helices, reminiscent of the role that helical
dipoles have in ClC chloride channels. Cation channels reverse the selectivity
imposed by orientation of the M2 dipoles through placing a negatively charged
side chain near the pore constriction point. Although other regions contribute
to the modulation of conductance and selectivity in some Cys-loop receptors,
the minimal determinants of selectivity are the −1′ Ala and −2′ Pro positions
for anions and the −1′ Glu for cations, with no requirement for positively
charged amino acids in the pore of anion-selective channels (Supplementary
Fig. 22).

To identify sites important in chloride binding and selectivity, we soaked
crystals of GluClcryst in iodide, a heavy atom analogue of chloride, and
observed four anomalous difference peaks that we ascribe to iodide, located at
the cytosolic base of the transmembrane pore and centred around the five-fold
symmetry axis (Fig. 6b-d). The weak density at the fifth site is simply the
consequence of an interfering lattice contact with an adjacent Fab. Each iodide
sits in a concave pocket of positive electrostatic potential formed by −2′ Pro
residues from the M2 helices of adjacent subunits, main-chain backbone atoms of
−1′ Ala and −3′ Ile and the methyl group of −1′ Ala. All three of these
residues are important in selectivity for some receptors, with the −1′
position being an essential component of selectivity across the family.
Previous studies indicate that the main chain amide nitrogen at the −3′
position is important in GluClβ receptors for anion dehydration. In GluClcryst
this atom is ~5 Å from the centre of the iodide anomalous density and could
form water-mediated hydrogen bonds to anions at the mouth of the ion channel
pore.

Electron density maps derived from all GluClcryst X-ray diffraction data sets
exhibit a spherical peak in the pore between the 2′ Thr and 6′ Thr residues
(6.8σ in F_o − F_c omit maps) with no other peaks in the pore above 2.5σ.
Anomalous difference electron density maps were inconclusive in identification
of this peak. Therefore, we placed several different ions or molecules in the
difference density and determined, through crystallographic refinement and
inspection of difference electron density maps, that a single chloride anion
best accounted for this electron density feature (Supplementary Table 1).
Distances between the modelled chloride and 6′ Thr side-chain hydroxyl oxygen
atoms are consistent with water-mediated hydrogen bonding of chloride in the
pore, indicating that this location could be a transiently occupied
ion-binding site flanking the constriction point. Further experimentation is
required to validate the chemical identity of the bound species.

The iodide-binding sites nestled in electropositive pockets at the base of the
pore suggest general principles of ion selectivity in Cys-loop receptors. In
GluCl and other chloride-selective receptors there is either an alanine or
glycine residue at the −1′ position of the M2 helix, thus preserving the
concave pocket. By contrast, in eukaryotic cation-selective channels, the −1′
residue is a conserved glutamate. We suggest that the carboxylate side chain of
glutamate not only fills the 'anion pocket' but that it also imposes a local
negative electrostatic potential important for cation selectivity
(Supplementary Fig. 22). Previous cysteine accessibility studies in
cation-selective channels have indicated that the −1′ Glu position lines the
transmembrane pore. However, on the basis of the GluClcryst structure and amino
acid sequence alignments, we propose that the preceding residue, a conserved
glycine (−2′ residue), lines the pore of cation channels, consistent with the
significantly larger pore diameter of cation channels (7.4-8.4 Å). In support
of the −2′ residue defining the pore constriction, deletion of the −2′ Pro in
glycine receptors, which would shift the following glycine residue into the
−2′ position, increases pore diameter to 6.9 Å (ref. 47). Furthermore, the
−2′ Gly position in cation-selective 5-HT3A receptors is accessible to
modification when the pore is open. We propose that the −2′ position lines the
pore in both anion and cation channels and that the 'anion pockets' in
GluClcryst are important determinants of ion selectivity, increasing the local
concentration of anions at the cytoplasmic mouth of the pore.


Conclusion 

Here we present the first X-ray structure, to our knowledge, of a eukaryotic
Cys-loop receptor, a glutamate-gated chloride channel from C. elegans.
GluClcryst was co-crystallized with ivermectin, a partial allosteric agonist
that sequesters within the membrane bilayer and binds to exposed sites on the
transmembrane domains of the receptor. Lipophilic modulators of other Cys-loop
receptors may exploit a similar mechanism of interaction, including the
neurosteroids at the GABA_A receptor and cholesterol at the muscle nAChR. The
GluClcryst-ivermectin structure maps a previously uncharacterized binding site
at a protein-lipid interface and defines a protein/chemical scaffold for design
of receptors and ligands with new pharmacological properties and receptor
specificities. Binding of ivermectin induces local changes in the membrane
domain and global conformational changes in the entire receptor, pre-organizing
the agonist binding site ~30 Å away and opening the ion channel pore. Analysis
of amino acids lining and proximal to the pore indicates that anion selectivity
is accomplished largely through a pore constriction imposed by proline residues
and a positive electrostatic potential, conferred by the N-terminal end of the
M2 helix dipoles. These new findings advance our understanding of the molecular
mechanism of fast neuronal inhibition, the importance of which was first
appreciated more than one hundred years ago.


METHODS SUMMARY 


GluClcryst was expressed from baculovirus-infected Sf9 cells and purified by
metal ion affinity chromatography. The Fab complex was isolated by
size-exclusion chromatography. The GluClcryst-Fab complex was concentrated to
1-2 mg ml⁻¹ and supplemented with synthetic lipids and ivermectin.
Crystallization was performed by hanging-drop vapour diffusion at 4 °C with a
precipitating solution containing 21-23% PEG 400, 50 mM sodium citrate pH 4.5
and 70 mM sodium chloride. Cryoprotection was achieved by soaking crystals in
precipitant solution supplemented with 30% PEG 400. Additional complexes were
obtained by soaking crystals in cryoprotectant containing L-glutamate,
picrotoxin or sodium iodide. Diffraction data were indexed, integrated and
scaled and the structure solved by molecular replacement using a GLIC-derived
homology model of GluClcryst and a Fab homology model as search probes. The
molecular replacement phases were used to initiate autobuilding and the
resulting model was iteratively improved by cycles of manual adjustment and
crystallographic refinement. Function of GluCl was examined by two-electrode
voltage clamp experiments and by [³H]-L-glutamate saturation and competition
binding assays.


Full Methods and any associated references are available in the online version of 
the paper at www.nature.com/nature. 
METHODS


Construct design. The gene encoding the full-length C. elegans GluClα protein
(Genbank accession code AAA50785.1), including the native signal peptide and
a C-terminal 8× histidine tag, was codon optimized and subcloned into the
pFastBac1 vector for baculovirus-driven expression in Sf9 insect cells. A
construct for FSEC-based small-scale screening of detergent stability,
mutagenesis and purification additionally contained the enhanced GFP
(EGFP)-coding sequence inserted into the M3-M4 loop region as previously
described. To improve crystallization behaviour, 41 amino acid residues from
the N terminus and 6 from the C terminus were removed, and residues K345-K402
(in the mature, full-length sequence), corresponding to the M3-M4 loop, were
substituted with the residues AGT.

GluCl expression and purification. Bacmid and baculovirus were generated
from pFastBac1 constructs and Sf9 cells were infected at 27 °C using standard
methods. After 18 h of infection, cells were maintained shaking at 20 °C, and
then harvested for purification after 72-96 h. Cells were collected by
centrifugation at 6,200g and disrupted using an EmulsiFlex-C5 (Avestin) in
buffer containing 20 mM Tris pH 7.4, 150 mM NaCl (TBS buffer), and 1 mM PMSF.
The homogenate was clarified by centrifugation at 9,700g, and crude membranes
were collected from the light membrane fraction by centrifugation at 125,000g.
The membranes were mechanically homogenized and solubilized in 0.25 g C12M
(n-dodecyl-β-D-maltopyranoside; Anatrace) per gram of membranes in TBS.
Solubilized membranes were centrifuged at 125,000g. Supernatant containing
GluClcryst was bound to TALON Co²⁺-affinity resin (Clontech), washed with
TBS solution containing 1 mM C12M and 25 mM imidazole, and eluted with
250 mM imidazole. All purification steps were performed at 4 °C.

Monoclonal antibody generation and Fab purification. The mouse monoclonal
antibody against GluCl (IgG1, λ) was obtained using standard methods.
Specificity of the antibody for properly folded pentameric GluClcryst was
assayed by FSEC and western blot. Cloning and sequencing of Fab antibody
regions were performed from mouse hybridoma cells. Antibody was purified from
hybridoma supernatants by cation exchange and protein A affinity
chromatography. Fab fragments were generated by papain digest of whole
antibody, and purified by protein A chromatography to remove Fc molecules and
undigested material, followed by anion exchange.

Purification of GluClcryst-Fab complex. Eluent from Co²⁺-affinity purification
and Fab from ion exchange were mixed to an excess of Fab to GluClcryst
subunits, concentrated, and applied to a gel filtration column (Superose 6
10/300 GL, GE Healthcare Life Sciences) equilibrated in TBS + 1 mM C12M.
GluClcryst-Fab complex was concentrated to 1-2 mg ml⁻¹. For samples used in
crystallization, 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) or
1,2-dipalmitoyl-sn-glycero-3-phosphocholine (DPPC) lipids (Avanti Polar Lipids)
were added to 0.02% from a 2% stock suspension in 20% DMSO, 80% gel filtration
buffer, and ivermectin (Sigma) was added to 0.1 mM from a 10 mM stock in DMSO.

Crystallization and cryoprotection. Initial crystallization attempts of GluCl
constructs in the absence of Fab resulted in poorly diffracting (8 Å) crystals
that grew in a very limited range of crystallization conditions.
Crystallization of the Fab complex occurred in diverse conditions; best
diffracting crystals were obtained in hanging-drop format and diffracted to
4-5 Å. Crystals diffracting beyond 4 Å were obtained only in the presence of
Fab, either POPC or DPPC, and ivermectin. These tetragonal crystals grew by
vapour diffusion at 4 °C in 21-23% PEG 400, 50 mM sodium citrate pH 4.5, and
70 mM sodium chloride, and diffracted maximally to Bragg spacings of 3.26 Å
(Supplementary Table 1). Crystals were protected before flash freezing in
liquid nitrogen by 1-2 min soaks in crystallization solution supplemented to
contain 30% PEG 400. To obtain structures of GluClcryst in complex with
additional ligands, crystals of the same form were soaked briefly in
cryoprotectant containing either 5 mM picrotoxin (picrotoxinin, the more active
component of picrotoxin, was used, obtained from Sigma), 50 mM L-glutamate or
300 mM sodium iodide. In an effort to minimize occupancy of chloride in the
iodide-soaked crystals, crystals were transferred serially into three replicate
cryoprotectant solutions lacking chloride before flash freezing. Nonetheless,
because the iodide soaks were only 1-2 min, some chloride may have been carried
over from crystallization. Electron density maps derived from these crystals
showed no significant change in the strength of the electron density feature in
the pore where we have modelled a chloride ion. We also soaked crystals in an
analogous manner in bromide-containing cryo-solutions but were not able to
observe significant peaks in the resultant anomalous difference electron
density maps.

Data collection. Diffraction data were collected using synchrotron radiation at
the Advanced Photon Source (Argonne National Laboratory, beamline 24-ID-C)
with a mini-Kappa goniometer and in-house crystal alignment strategy software.
The best-ordered crystals have a diffraction limit of 3.26 Å, a mosaic spread of
0.2-0.5°, and they are of the space group P4₃22 with one GluClcryst-Fab complex
per asymmetric unit. The unit cell dimensions are a = b = 155 Å, c = 575 Å,
α = β = γ = 90°, resulting in a Matthews' coefficient (V_M) of 4.0 Å³ Da⁻¹
(ref. 55). Diffraction data were indexed, integrated and scaled using HKL2000
(ref. 56) or Xia2 (refs 57-62) software.
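The Matthews coefficient quoted above relates the unit cell volume to the protein mass it contains, V_M = V_cell / (Z × M). As a rough consistency check, the sketch below inverts this relation to estimate the mass per asymmetric unit implied by the reported cell and V_M. Two assumptions are introduced here and are not taken from the source: Z = 8 asymmetric units per cell, which holds for a primitive tetragonal space group of point group 422, and the conventional 1.23 Å³ Da⁻¹ factor used to estimate solvent content.

```python
# Unit cell edges (Å) and Matthews coefficient (Å^3 per Da) from the text.
A_EDGE = 155.0
C_EDGE = 575.0
V_M = 4.0
# Assumption: 8 asymmetric units per cell (primitive tetragonal, point group 422).
ASU_PER_CELL = 8


def cell_volume_tetragonal(a: float, c: float) -> float:
    """Volume of a tetragonal cell (a = b, all angles 90 degrees), in Å^3."""
    return a * a * c


def implied_mass_per_asu(v_m: float, volume: float, z: int) -> float:
    """Mass per asymmetric unit (Da) implied by V_M = volume / (z * mass)."""
    return volume / (z * v_m)


def solvent_fraction(v_m: float) -> float:
    """Approximate solvent content using the conventional 1.23 factor (assumed)."""
    return 1.0 - 1.23 / v_m


volume = cell_volume_tetragonal(A_EDGE, C_EDGE)
mass = implied_mass_per_asu(V_M, volume, ASU_PER_CELL)
print(f"cell volume: {volume:.0f} Å^3")
print(f"implied mass per asymmetric unit: {mass / 1000:.0f} kDa")
print(f"approximate solvent content: {solvent_fraction(V_M):.0%}")
```

Under these assumptions the implied mass is a little over 430 kDa per asymmetric unit and the solvent content is close to 70%.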

Structure determination. The structure was solved by molecular replacement
using Phaser; the search probe was a pentameric homology model of GluClcryst
made from GLIC (PDB code 3EHZ), using Swiss-Model. After an initial solution
was found, phases were improved by solvent flattening and electron density for
Fab molecules bound at each of the five subunit interfaces of GluClcryst became
plainly visible. A Fab homology model was made, using PDB 1NGQ for the light
chain and 1F3D for the heavy chain, and Coot to overlay the two modelled chains
to make a single Fab molecule. Fab CDR loops were truncated and the model was
used for molecular replacement using the GluClcryst solution as a starting
point. In this manner, a single Fab was placed, and by copying the remaining
Fab molecules around the fivefold non-crystallographic symmetry (NCS) axis,
approximate positioning of all Fab molecules was accomplished. Electron density
for Fab constant domain regions was poor after NCS averaging, and from
non-averaged maps it was clear that the Fab constant domains did not obey
five-fold symmetry. A starting model that included GluClcryst and five Fab
variable domains was used for automated building with Buccaneer. Electron
density maps were then good enough to position ivermectin molecules in the
transmembrane domain loci, and to begin manual building of the Fab constant
domains. Ivermectin stereochemistry, determined previously, is modelled as
such. Numbering of ivermectin atoms in the figures is as defined in the PDB
files; Supplementary Table 2 relates this numbering to that from the small
molecule structure.
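Copying one placed Fab around the fivefold NCS axis amounts to applying successive 72° rotations about that axis. The sketch below illustrates the operation on a single coordinate; it assumes, purely for illustration, that the symmetry axis coincides with the z axis and passes through the origin, and it does not reproduce the software actually used for this step.

```python
import math


def rotate_about_z(point, angle_deg):
    """Rotate an (x, y, z) point about the z axis by angle_deg degrees."""
    x, y, z = point
    t = math.radians(angle_deg)
    return (x * math.cos(t) - y * math.sin(t),
            x * math.sin(t) + y * math.cos(t),
            z)


def fivefold_copies(point):
    """Return the five symmetry mates of a point (rotations of 0, 72, ... 288 degrees)."""
    return [rotate_about_z(point, 72.0 * k) for k in range(5)]


# Hypothetical atom position, 30 Å from the axis (illustrative values only).
for copy in fivefold_copies((30.0, 0.0, 10.0)):
    print(tuple(round(v, 2) for v in copy))
```

In a real structure the same rotation would be applied to every atom of the placed Fab, with the axis position and orientation taken from the refined NCS operators rather than assumed.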

Iterative refinement of the model against the X-ray data using Phenix, manual
adjustment in Coot into simulated annealing composite omit electron density
maps or real-space averaged maps, and structure quality analysis using
Molprobity were carried out until satisfactory model statistics were obtained.
Three groups of fivefold NCS restraints were present during refinement: five
subunits of GluClcryst, five heavy chain Fab variable domains (residues 1-120),
and the five light chain Fab variable domains (residues 1-108); the root mean
squared deviation (r.m.s.d.) values between the chains within each of these
three groups were 0.017, 0.014 and 0.015 Å, respectively. Isotropic B factors
and TLS parameters were also refined; the 15 TLS groups comprised five
GluClcryst subunits, five Fab variable domains, and five Fab constant domains.
The final models contain the GluClcryst pentamer from residues 1-339 or 340,
five ivermectin molecules, a single N-linked carbohydrate at N185 in three of
the five subunits, five Fab molecules (1-221 for heavy chains, 1-210 for light
chains), and several lipid and detergent molecules. Some portions of the Fab
constant domains lacked electron density in composite omit maps and hence were
omitted from the final model. The iodide-bound structure is of very low
resolution and not completely refined: several anomalous difference electron
density peaks in the extracellular domain were not modelled with iodide atoms.

Sequence alignments were made using PROMALS3D (ref. 72) and ClustalW.
Isoelectric surface calculations were made using the APBS add-on in PyMOL.
Pore dimensions were analysed using HOLE software.

Electrophysiology. RNAs encoding GluCl proteins were transcribed from
pGEM-HE plasmids using the mMessage mMachine T7 Ultra kit (Ambion).
Defolliculated stage V-VI Xenopus oocytes were provided by D. C. Dawson and
C. Alexander, prepared as previously described. Oocytes were injected with
25 ng of GluCl RNAs, and current recordings were made 3-5 days afterwards.
Frog saline (FS) recording solution contained (in mM): 96 NaCl, 2 KCl, 1 MgCl2,
1.8 CaCl2, 5 HEPES pH 7.5. Recording solution for iodide permeability
experiments was FS but with NaI in place of NaCl. All ligands were made up in
FS from stock solutions in water, except: picrotoxin, 1 M stock in DMSO;
ivermectin, 5 mM stock in DMSO. Recording electrode pipettes (0.7-2 MΩ) were
cushioned with 0.8% LMP agarose in 3 M KCl and backfilled with 3 M KCl. Oocytes
were voltage-clamped at −80 mV except in experiments to determine the reversal
potential, which used 40 ms voltage steps. Analogue data were filtered at 50 Hz
and digitized at =1kHz. The Axoclamp 2B amplifier (Axon Instruments) and 
pClamp 10 software (Molecular Devices) were used for data acquisition. In un- 
injected oocytes, no significant responses to test solutions were observed (Sup- 
plementary Fig. 13). 

Radioligand binding experiments. Experiments to test binding of
[³H]-L-glutamate to GluClcryst and competition of the radioligand with other
compounds were done using purified Nano15-tagged GluClcryst (N-terminal tag)
and streptavidin-YSi scintillation proximity assay beads (SPA; GE Healthcare
Life Sciences). The concentration of binding sites was fixed at 100 nM after a
preliminary experiment to determine optimal GluClcryst concentration
(Supplementary Fig. 12). Other binding assay components were: 50 mM Tris
pH 7.4, 150 mM NaCl, 1 mM C12M, 1 mg ml⁻¹ SPA beads, and 1 µM ivermectin.
Saturation binding of [³H]-L-glutamate in the presence and absence of Fab was
performed with a 1:30 dilution of specific activity of the radiolabel with
[¹H]-L-glutamate, and a slight molar excess of Fab to GluClcryst subunits as
verified by FSEC experiments. Measurement of background signal in saturation
binding experiments was complicated by, we believe, significant binding of
[³H]-L-glutamate directly to SPA beads and lack of a chemically distinct
competitor for the neurotransmitter-binding site. Neither high concentrations
of [¹H]-L-glutamate nor absence of protein was able to fully account for this
apparently non-specific signal. To address the background component that was
not accurately measured experimentally, we combined subtraction of a background
signal measured in the absence of GluClcryst with a linear component still
present in the binding data (calculated using the total binding function in the
fitting software). In saturation binding experiments in the presence of Fab
(Supplementary Fig. 12), data were better fit after removing background signal
measured in the presence of 10 mM [¹H]-L-glutamate combined with the calculated
linear component. In competition binding experiments to determine IC50 values,
[³H]-L-glutamate was 1 µM using a 1:10 dilution of specific activity of the
radiolabel with cold glutamate. In all [³H]-L-glutamate and
electrophysiological dose-response experiments, data were fit with GraphPad
Prism software.
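The conversion from a measured IC50 to an inhibition constant can be sketched as follows. The source does not name the relation used; the example assumes the standard Cheng-Prusoff equation, K_i = IC50 / (1 + [L]/K_d), with the radioligand concentration read as 1 µM and the K_d of 680 nM taken from the figure legend. The IC50 value in the example is hypothetical and chosen only to illustrate the arithmetic.

```python
# Values taken from the text (radioligand concentration read as 1 µM).
RADIOLIGAND_NM = 1000.0  # [3H]-L-glutamate concentration, nM
KD_NM = 680.0            # assumed K_d for [3H]-L-glutamate, nM


def cheng_prusoff_ki(ic50_nm: float,
                     radioligand_nm: float = RADIOLIGAND_NM,
                     kd_nm: float = KD_NM) -> float:
    """Inhibition constant (nM) from IC50 via the Cheng-Prusoff relation."""
    return ic50_nm / (1.0 + radioligand_nm / kd_nm)


# Hypothetical competitor with an IC50 of 10 µM (illustrative, not from the paper).
example_ic50_nm = 10_000.0
print(f"K_i = {cheng_prusoff_ki(example_ic50_nm) / 1000:.2f} µM")
```

With these concentrations every IC50 is divided by about 2.47. The relation assumes that the receptor concentration is well below K_d, which should be kept in mind given the 100 nM binding-site concentration used in these assays.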